Abstract In this study, we consider limit theorems for microscopic stochastic models
of neural fields. We show that the Wilson-Cowan equation can be obtained as the
limit in uniform convergence on compacts in probability for a sequence of microscopic
models when the number of neuron populations distributed in space and the
number of neurons per population tend to infinity. This result also allows us to obtain
limits for qualitatively different stochastic convergence concepts, e.g., convergence 
in the mean. Further, we present a central limit theorem for the martingale part of 
the microscopic models which, suitably re-scaled, converges to a centred Gaussian 
process with independent increments. These two results provide the basis for pre- 
senting the neural field Langevin equation, a stochastic differential equation taking 
values in a Hilbert space, which is the infinite-dimensional analogue of the chemical 
Langevin equation in the present setting. On a technical level, we apply recently de- 
veloped law of large numbers and central limit theorems for piecewise deterministic 
processes taking values in Hilbert spaces to a master equation formulation of stochas- 
tic neuronal network models. These theorems are valid for processes taking values in 
Hilbert spaces, and by this are able to incorporate spatial structures of the underlying 
model. 
1 Introduction

The present study is concerned with the derivation and justification of neural field 
equations from finite size stochastic particle models, i.e., stochastic models for the 
behaviour of individual neurons distributed in finitely many populations, in terms of 
mathematically precise probabilistic limit theorems. We illustrate this approach with 
the example of the Wilson-Cowan equation 

$$\tau\,\dot v(t,x) = -v(t,x) + f\Big(\int_D w(x,y)\,v(t,y)\,\mathrm{d}y + I(t,x)\Big). \tag{1.1}$$

We focus on the following two aspects: 

(A) Often one wants to study deterministic equations such as Eq. (1.1) in order to 
obtain results on the 'behaviour in the mean' of an intrinsically stochastic sys- 
tem. Thus, we first discuss limit theorems of the law of large numbers type for 
the limit of infinitely many particles. These theorems connect the trajectories of 
the stochastic particle models to the deterministic solution of mean field equations,
and hence provide a justification for studying Eq. (1.1) in order to draw inferences about the
behaviour of the stochastic system.

(B) Secondly, we aim to characterise the internal noise structure of the complex dis- 
crete stochastic models as in the limit of large numbers of neurons the noise is 
expected to be close to a simpler stochastic process. Ultimately, this yields a 
stochastic neural field model in terms of a stochastic evolution equation concep- 
tually analogous to the Chemical Langevin Equation. The Chemical Langevin 
Equation is widely used in the study of chemical reaction networks for which
the stochastic effects cannot be neglected but a numerical or analytical study of 
the exact discrete model is not possible due to its inherent complexity. 

In this study, we understand as a microscopic model a description as a stochastic pro- 
cess, usually a Markov chain model, also called a master equation formulation (cf. [3, 
5, 8, 9, 22] containing various master equation formulations of neural dynamics). In 
contrast, a macroscopic model is a deterministic evolution equation such as (1.1). Deterministic
mean field equations have been used widely and for a long time to model
and analyse large scale behaviour of the brain. In their original deterministic form, 
they are successfully used to model geometric visual hallucinations, orientation tun- 
ing in the visual cortex and wave propagation in cortical slices to mention only a few 
applications. We refer to [7] for a recent review and an extensive list of references. 
The derivation of these equations is based on a number of arguments from statisti- 
cal physics and for a long time a justification from microscopic models has not been 
available. The interest in deriving mean field equations from stochastic microscopic
models has been revived recently as it offers the possibility of deriving deterministic
'corrections' to the mean field equations, also called second-order approximations. 
These corrections might account for the inherent stochasticity, and thus incorporate 
so called finite size effects. This has been achieved by either applying a path-integral 
approach to the master equation [8, 9] or by a van Kampen system-size expansion of 
the master equation [5]. In more detail, the author in the latter reference proposes a 
particular master equation for a finite number of neuron populations and derives the 
Wilson-Cowan equation as the first-order approximation to the mean via employing
the van Kampen system size expansion and then taking the continuum limit for a con- 
tinuum of populations. In keeping also the second-order terms, a 'stochastic' version 
of the mean field equation is also presented in the sense of coupling the first moment 
equation to an equation for the second moments. 

However, the van Kampen system size expansion does not give a precise mathematical
connection: it neither quantifies the type of convergence (quality of the
limit) nor states conditions under which the convergence is valid, and it does not
allow one to characterise the speed of convergence. Furthermore, particular care has to be taken in systems
possessing multiple fixed points of the macroscopic equation, and we refer to
[5] for a discussion of this aspect in the neural field setting. The limited applicability
of the van Kampen system size expansion was already well known to
van Kampen; see Sect. 10 in [33]. In parallel to the work of van Kampen, T. Kurtz derived precise
limit theorems connecting sequences of continuous time Markov chains to solutions 
of systems of ordinary differential equations; see the seminal studies [19, 20] or the 
monograph [15]. Limit theorems of that type are usually called the fluid limit, ther- 
modynamic limit, or hydrodynamic limit; for a review, see, e.g., [13]. 

As is thoroughly discussed in [5], establishing the connection between master
equation models and mean field equations involves two limit procedures. First, a 
limit which takes the number of particles, in this case neurons per considered popu- 
lation, to infinity (thermodynamic limit), and a second which gives the mean field by 
taking the number of populations to infinity (continuum limit). In this 'double limit', 
the theorems by Kurtz describe the connection of taking the number of neurons per 
population to infinity yielding a system of ordinary differential equations, one for each
population. Then the extension from finite to infinite dimensional state space is ob- 
tained by a continuum limit. This procedure corresponds to the approach in [5]. Thus, 
taking the double limit step by step raises the question of what happens if we first take
the spatial limit and then the fluid limit, thus reversing the order of the limit procedures,
or if we take the limits simultaneously. Recently, in an extension to
the work of Kurtz, one of the present authors and co-authors established limit the- 
orems that achieve this double limit [27], thus being able to connect directly finite 
population master equation formulations to spatio-temporal limit systems, e.g., partial
differential equations or integro-differential equations such as the Wilson-Cowan
equation (1.1). In a general framework, these limit theorems were derived for Piece- 
wise Deterministic Markov Processes on Hilbert spaces, which in addition to the 
jump evolution also allow for a coupled deterministic continuous evolution. This 
generality was motivated by applications to neuron membrane models consisting of 
microscopic models of the ion channels coupled to a deterministic equation for the 
transmembrane potential. We find that this generality is also advantageous for the 
present situation of a pure jump model as it allows us to include time-dependent inputs.
In this study, we employ these theorems to achieve the aims (A) and (B) focussing on 
the example of the deterministic limit given by the Wilson-Cowan equation (1.1). 

Finally, we state what this study does not contain, which in particular distinguishes 
the present study from [5, 8, 9] beyond mathematical technique. Presently, the aim is 
not to derive moment equations, i.e., a deterministic set of equations that approximate 
the moments of the Markovian particle model, but rather processes (deterministic or 
stochastic) to which a sequence of microscopic models converges under suitable conditions
in a probabilistic way. This means that a microscopic model, which is close
to the limit — presently corresponding to a large number of neurons in a large num- 
ber of populations — can be assumed to be close to the limiting processes in structure 
and pathwise dynamics as indicated by the quality of the stochastic limit. Hence, the 
present work is conceptually — though neither in technique nor results — close to [30] 
wherein, using a propagation of chaos approach in the vicinity of neural field equations,
the author also derives in a mathematically precise way a limiting process to
finite particle models. However, it is an obvious consequence that the convergence of 
the models necessarily implies a close resemblance of their moment equations. This 
provides the connection to [5, 8, 9], which we briefly comment on in Appendix B. 

As a guide, we close this introduction with an outline of the subsequent sections 
and some general remarks on the notation employed in this study. In Sects. 1.1 to 1.3, 
we first discuss the two types of mean field models in more detail, on the one hand, 
the Wilson-Cowan equation as the macroscopic limit and, on the other hand, a master 
equation formulation of a stochastic neural field. The main results of the paper are 
found in Sect. 2. There we set up the sequence of microscopic models and state 
conditions for convergence. Limit theorems of the law of large numbers type are 
presented in Theorem 2.1 and Theorem 2.2 in Sect. 2.1. The first is a classical weak 
law of large numbers providing uniform convergence on compacts in probability and 
the second convergence in the mean uniformly over the whole positive time axis. 
Next, a central limit theorem for the martingale part of the microscopic models is 
presented in Sect. 2.2 characterising the internal fluctuations of the model to be of 
a diffusive nature in the limit. This part of the study is concluded in Sect. 2.3 by 
presenting the Langevin approximations that arise as a result of the preceding limit 
theorems. The proofs of the theorems in Sect. 2 are deferred to Sect. 4. The study 
is concluded in Sect. 3 with a discussion of the implications of the presented results 
and an extension of these limit theorems to different master equation formulations or 
mean field equations. 

Notations and Conventions Throughout the study, we denote by $L^p(D)$, $1 \le p \le \infty$,
the Lebesgue spaces of real functions on a domain $D \subset \mathbb{R}^d$, $d \ge 1$. Physically
reasonable choices are $d \in \{1, 2, 3\}$; however, for the mathematical theory presented
the spatial dimension can be arbitrary. In the present study, spatial domains $D$ are
always bounded with a sufficiently smooth boundary, where the minimal assumption
is a strong local Lipschitz condition; see [2]. For bounded domains $D$, this condition
simply means that for every point on the boundary its neighbourhood on the
boundary is the graph of a Lipschitz continuous function. Furthermore, for $\alpha \in \mathbb{N}$
we denote by $H^\alpha(D)$ the Sobolev spaces, i.e., subspaces of $L^2(D)$, with the corresponding
Sobolev norm. For $\alpha \in \mathbb{R}_+ \setminus \mathbb{N}$ we denote by $H^\alpha(D)$ the interpolating Besov
spaces. In this study, $H^{-\alpha}(D)$ is the dual space of $H^\alpha(D)$, which is in contrast to the
widespread notation to denote by $H^{-\alpha}(D)$, $\alpha > 0$, the dual space of $H^\alpha_0(D)$. As
usual, we have $H^0(D) = L^2(D) = H^{-0}(D)$. We thus obtain a continuous scale of
Hilbert spaces $H^\alpha(D)$, $\alpha \in \mathbb{R}$, which satisfy that $H^{\alpha_2}(D)$ is continuously embedded
in $H^{\alpha_1}(D)$ for all $\alpha_1 < \alpha_2$. Next, a pairing $(\cdot,\cdot)_{H^\alpha}$ denotes the inner product
of the Hilbert space $H^\alpha(D)$ and pairings in angle brackets $\langle\cdot,\cdot\rangle_{H^\alpha}$ denote the duality
pairing for the Hilbert space $H^\alpha(D)$. That is, for $\psi \in H^\alpha(D)$ and $\phi \in H^{-\alpha}(D)$ the
expression $\langle\phi,\psi\rangle_{H^\alpha}$ denotes the application of the real, linear functional $\phi$ to $\psi$.
Furthermore, the spaces $H^\alpha(D)$, $L^2(D)$ and $H^{-\alpha}(D)$ form an evolution triplet, i.e., the
embeddings are dense and the application of linear functionals and the inner product
in $L^2(D)$ satisfy the relation

$$\langle\phi,\psi\rangle_{H^\alpha} = (\phi,\psi)_{L^2} \qquad \forall\,\phi \in L^2(D),\ \psi \in H^\alpha(D). \tag{1.2}$$

Norms in Hilbert spaces are denoted by $\|\cdot\|_{H^\alpha}$, $\|\cdot\|_0$ is used to denote the supremum
norm of real functions, i.e., for $f : \mathbb{R} \to \mathbb{R}$ we have $\|f\|_0 = \sup_{z \in \mathbb{R}} |f(z)|$, and $|\cdot|$
denotes either the absolute value for scalars or the Lebesgue measure for measurable
subsets of Euclidean space. Finally, we use $\mathbb{N}_0$ to denote the set of non-negative
integers, i.e., the natural numbers including zero.

1.1 The Macroscopic Limit

Neural field equations are usually classified into two types: rate-based and activity- 
based models. The prototype of the former is the Wilson-Cowan equation; see 
Eq. (1.1), which we also restate below, and the Amari equation, see Eq. (3.7) in 
Sect. 3, is the prototype of the latter. Besides being of a different structure, due to 
their derivation, the variable they describe has a completely different interpretation. 
In rate-based models, the variable describes the average rate of activity at a certain 
location and time, roughly corresponding to the fraction of active neurons at a certain 
infinitesimal area. In activity-based models, the macroscopic variable is an average 
electrical potential produced by neurons at a certain location. For a concise physical 
derivation that leads to these models, we refer to [5]. In the following, we consider 
rate-based equations, in particular, the classical Wilson-Cowan equation, to discuss 
the type of limit theorems we are able to obtain. We remark that the results are essen- 
tially analogous for activity based models. 

Thus, the macroscopic model of interest is given by the equation 

$$\tau\,\dot v(t,x) = -v(t,x) + f\Big(\int_D w(x,y)\,v(t,y)\,\mathrm{d}y + I(t,x)\Big), \tag{1.3}$$

where $\tau > 0$ is a decay time constant and $f : \mathbb{R} \to \mathbb{R}_+$ is a gain (or response) function
that relates inputs that a neuron receives to activity. In (1.3), the value $f(z)$ can be
interpreted as the fraction of neurons that receive at least threshold input. Furthermore,
$w(x,y)$ is a weight function, which states the connectivity strength of a neuron
located at $y$ to a neuron located at $x$, and finally, $I(t,x)$ is an external input, which
is received by a neuron at $x$ at time $t$. For the weight function $w : D \times D \to \mathbb{R}$ and
the external input $I$, we assume that $w \in L^2(D \times D)$ and $I \in C(\mathbb{R}_+, L^2(D))$. As for
the gain function $f$, we assume in this study that $f$ is non-negative, satisfies a global
Lipschitz condition with constant $L > 0$, i.e.,

$$|f(a) - f(b)| \le L\,|a - b| \qquad \forall\,a, b \in \mathbb{R}, \tag{1.4}$$

and it is bounded. From an interpretive point of view, it is reasonable and consistent
to stipulate that $f$ is bounded by one (being a fraction) as well as being
monotone. The latter property corresponds to the fact that higher input results in
higher activity. In specific models, $f$ is often chosen to be a sigmoidal function, e.g.,
$f(z) = (1 + e^{-(\beta_1 z + \beta_2)})^{-1}$ in [6] or $f(z) = (\tanh(\beta_1 z + \beta_2) + 1)/2$ in [3], which
both satisfy $f \in [0, 1]$. Moreover, the most common choices of $f$ are even infinitely
often differentiable with bounded derivatives, which already implies the Lipschitz
condition (1.4).

Footnote: A normed space $X$ is continuously embedded in another normed space $Y$, in symbols $X \hookrightarrow Y$, if $X \subseteq Y$
and there exists a constant $K < \infty$ such that $\|u\|_Y \le K\,\|u\|_X$ for all $u \in X$.

The Wilson-Cowan equation (1.3) is well-posed in the strong sense as an integral
equation in $L^2(D)$ under the above conditions. That is, Eq. (1.3) possesses a unique,
continuously differentiable global solution $v$ to every initial condition $v(0) = v_0 \in
L^2(D)$, i.e., $v \in C^1([0, T], L^2(D))$ for all $T > 0$, which depends continuously on
the initial condition. Furthermore, if the initial condition satisfies $v_0(x) \in [0, \|f\|_0]$
almost everywhere in $D$, then it holds for all $t > 0$ that $v(t, x) \in (0, \|f\|_0)$ for almost
all $x \in D$. For a brief derivation of these results, we refer to Appendix A where we
also state a result about higher spatial regularity of the solution: Let $\alpha \in \mathbb{N}$ be such
that $\alpha > d/2$. If now $v_0 \in H^\alpha(D)$ and if $f$ is at least $\alpha$-times differentiable with
bounded derivatives and the weights and the input function satisfy $w \in H^\alpha(D \times D)$
and $I \in C(\mathbb{R}_+, H^\alpha(D))$, then the equation is well-posed in $H^\alpha(D)$, i.e.,
$v \in C^1([0, T], H^\alpha(D))$ for all $T > 0$. In particular, this implies that the solution $v$ is jointly
continuous on $\mathbb{R}_+ \times D$.
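
As an illustration of the invariance property stated above, the following minimal sketch integrates (1.3) on $D = (0,1)$ with an explicit Euler scheme and a midpoint Riemann sum for the integral. The Gaussian weight kernel, the logistic parameters, the zero input and the step sizes are assumptions chosen only for this example; they are not taken from the models discussed in this study.

```python
import math

def gain(z, beta1=4.0, beta2=-2.0):
    """Logistic gain with values in (0, 1); beta1, beta2 are illustrative choices."""
    return 1.0 / (1.0 + math.exp(-(beta1 * z + beta2)))

def euler_wilson_cowan(v0, w, inp, dx, tau=1.0, dt=0.01, steps=200):
    """Explicit Euler scheme for (1.3) on a grid; inp(t, i) is the input at grid point i."""
    n = len(v0)
    v = list(v0)
    for m in range(steps):
        t = m * dt
        # Riemann-sum approximation of the integral of w(x_i, y) v(t, y) over D
        drive = [sum(w[i][j] * v[j] for j in range(n)) * dx + inp(t, i)
                 for i in range(n)]
        # For dt <= tau the update is a convex combination of v and f(...)
        v = [v[i] + dt / tau * (-v[i] + gain(drive[i])) for i in range(n)]
    return v

n = 50
dx = 1.0 / n
x = [(i + 0.5) * dx for i in range(n)]                      # cell midpoints
w = [[math.exp(-((xi - xj) ** 2) / 0.02) for xj in x] for xi in x]
v0 = [0.5 * (1.0 + math.sin(2.0 * math.pi * xi)) for xi in x]  # values in [0, 1]
v_end = euler_wilson_cowan(v0, w, lambda t, i: 0.0, dx)
```

Since the Euler update is a convex combination of the current value and a value of $f$ whenever the step size does not exceed $\tau$, the discrete solution inherits the bounds $0 \le v \le \|f\|_0$ of the continuous one.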

1.2 Master Equation Formulations of Neural Network Models 

For the microscopic model, we concentrate on a variation of the model considered in 
[5, 6], which is already an improvement on a model introduced in [11]. We extend the
model including variations among neuron populations and foremost time-dependent 
inputs. We chose this model over the master equation formulations in [8, 9] as it pro- 
vides a more direct connection of the microscopic and macroscopic models; see also 
the discussion in Sect. 3. We describe the main ingredients of the model beginning 
with the simpler, time-independent model as prevalent in the literature. Subsequently, 
in Sect. 1.3 the final, time-dependent model is defined. 

We denote by $P$ the number of neuron populations in the model. Further, we assume
that the $k$th neuron population consists of identical neurons which can
be in one of two possible states: active, i.e., emitting action potentials, and inactive,
i.e., quiescent or not emitting action potentials. Transitions between states occur instantaneously
and at random times. For all $k = 1, \dots, P$, the random variables $\Theta^k_t$
denote the number of active neurons at time $t$. An integer $l(k)$ is used to characterise
the population size. This number $l(k)$ can be interpreted as the number of neurons in
the $k$th population, at least for sufficiently large values. However, this is not accurate
in the literal sense as it is possible with positive probability for populations to contain
more than $l(k)$ active neurons. Nevertheless, a posteriori the interpretation can be salvaged
from the obtained limit theorems. It is a corollary of these that the probability
of more than $l(k)$ neurons being active for some time becomes arbitrarily small for
large enough $l(k)$. Hence, for physiologically reasonable neuron numbers the probability
in these models of observing 'non-physiological' trajectories in the interpretation
becomes ever smaller.

Proceeding with notation, $\Theta_t = (\Theta^1_t, \dots, \Theta^P_t)$ is an (unbounded) piecewise constant
stochastic process taking values in $\mathbb{N}_0^P$. The stochastic transitions from inactive
to active states and vice versa for a neuron in population $k$ are governed by a constant
inactivation rate $\tau^{-1} > 0$ (uniformly for all populations) and inputs from other
neurons depending on the current network state. This non-negative activation rate is
given by $\tau^{-1}\,l(k)\,\bar f_k(\theta)$ for $\theta \in \mathbb{N}_0^P$. For the definition of $\bar f_k$ we consider weights
$W_{kj}$, $k, j = 1, \dots, P$, which weigh the input one neuron in population $k$ receives
from a neuron in population $j$. Then the activation rate of a neuron in population $k$ is
proportional to

$$\bar f_k(\theta) := f\Big(\sum_{j=1}^{P} W_{kj}\,\theta^j\Big) \tag{1.5}$$

for a non-negative function $f : \mathbb{R} \to \mathbb{R}_+$, which obviously corresponds to the gain
function $f$ in the Wilson-Cowan equation (1.3). We remark that here $\bar f_k$ is not the
rate of activation of one neuron. In this model, the activation rate of a population
is not proportional to the number of inactive neurons but it is proportional to $l(k)$,
which stands for the total number of neurons in the population. In [5], this rate is thus
interpreted as the rate with which a neuron becomes or remains active.

It follows that the process $(\Theta_t)_{t \ge 0}$ is a continuous-time Markov chain which is
usually defined via the following master equation, where $e_k$ denotes the $k$th basis
vector of $\mathbb{R}^P$,

$$\frac{\mathrm{d}\mathbb{P}[\theta,t]}{\mathrm{d}t} = \frac{1}{\tau}\sum_{k=1}^{P}\Big( l(k)\,\bar f_k(\theta - e_k)\,\mathbb{P}[\theta - e_k, t] - \big(\theta^k + l(k)\,\bar f_k(\theta)\big)\,\mathbb{P}[\theta, t] + (\theta^k + 1)\,\mathbb{P}[\theta + e_k, t]\Big), \tag{1.6}$$

which is endowed with the boundary conditions $\mathbb{P}[\theta, t] = 0$ if $\theta \notin \mathbb{N}_0^P$. In (1.6), the
variable $\mathbb{P}[\theta, t]$ denotes the probability that the process $\Theta_t$ is in state $\theta$ at time $t$.
Finally, the definition is completed with stating an initial law $\mathcal{L}$, the distribution of
$\Theta_0$, i.e., providing an initial value for the ODE system (1.6).

Footnote: The derivation of limit theorems for bounded population sizes, where $l(k)$ actually is the number of neurons
per population, is much more delicate than the subsequent presentation as the transition rate functions
become discontinuous. Although this would be a desirable result, we have not yet been able to prove such
a theorem, though it is clear that the Wilson-Cowan equation would be the only possible limit. See also a
discussion of this aspect in Sect. 3.2.

Another definition of a continuous-time Markov chain is via its generator; see,
e.g., [15]. Although the master equation is widely used in the physics and chemical
reactions literature, the mathematically more appropriate object for the study of a
Markov process is its generator and the master equation is an object derived from
the generator, see Sect. V in [33]. The generator of a Markov process is an operator
defined on the space of real functions over the state space of the process. For the
above model defined by the master equation (1.6), the generator is given by



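$$\mathcal{A}g(\theta) = \lambda(\theta)\int_{\mathbb{N}_0^P}\big(g(\xi) - g(\theta)\big)\,\mu(\theta, \mathrm{d}\xi) \tag{1.7}$$

for all suitable $g : \mathbb{N}_0^P \to \mathbb{R}$. For details, we refer to [15]. Here, $\lambda$ is the total instantaneous
jump rate, given by

$$\lambda(\theta) := \frac{1}{\tau}\sum_{k=1}^{P}\big(\theta^k + l(k)\,\bar f_k(\theta)\big), \tag{1.8}$$

and defines the distribution of the waiting time until the next jump, i.e.,

$$\mathbb{P}\big[\Theta_{t+s} = \Theta_t\ \forall\,s \in [0, \Delta t]\ \big|\ \Theta_t = \theta\big] = e^{-\lambda(\theta)\,\Delta t}.$$

Further, the measure $\mu$ in (1.7) is a Markov kernel on the state space of the process
defining the conditional distribution of the post-jump value, i.e.,

$$\mathbb{P}\big[\Theta_t \in A\ \big|\ \Theta_t \ne \Theta_{t-}\big] = \mu(\Theta_{t-}, A) \tag{1.9}$$

for all sets $A \subset \mathbb{N}_0^P$. In the present case for each $\theta$, the measure is given by the
discrete distribution

$$\mu(\theta, \{\theta - e_k\}) = \frac{1}{\tau}\,\frac{\theta^k}{\lambda(\theta)}, \qquad \mu(\theta, \{\theta + e_k\}) = \frac{1}{\tau}\,\frac{l(k)\,\bar f_k(\theta)}{\lambda(\theta)} \qquad \forall\,k = 1, \dots, P. \tag{1.10}$$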

The importance of the generator lies in the fact that it fully characterises a Markov 
process and that convergence of Markov processes is strongly connected to the con- 
vergence of their generators; see [15]. 
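
The description of the chain through the total jump rate (1.8) and the post-jump kernel (1.10) translates directly into a simulation scheme of Gillespie type: draw an exponential waiting time with rate $\lambda(\theta)$, then select the jump according to $\mu(\theta, \cdot)$. The following sketch is only meant to illustrate this reading of the time-homogeneous model; the logistic gain, the two-population weights and all parameter values are assumptions made for the example.

```python
import math
import random

def gain(z):
    """Illustrative logistic gain f with values in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def activation_rates(theta, W, l, tau):
    """Rates tau^{-1} l(k) f(sum_j W_kj theta^j) of the jumps theta -> theta + e_k, cf. (1.5)."""
    P = len(theta)
    return [l[k] * gain(sum(W[k][j] * theta[j] for j in range(P))) / tau
            for k in range(P)]

def gillespie(theta0, W, l, tau, t_end, rng):
    """Simulate one path of the chain on [0, t_end]; returns a list of (time, state)."""
    theta, t = list(theta0), 0.0
    P = len(theta)
    path = [(t, tuple(theta))]
    while True:
        up = activation_rates(theta, W, l, tau)   # theta -> theta + e_k
        down = [th / tau for th in theta]         # theta -> theta - e_k
        lam = sum(up) + sum(down)                 # total jump rate (1.8), always > 0
        t += rng.expovariate(lam)                 # exponential waiting time
        if t > t_end:
            return path
        # Post-jump state drawn from the kernel (1.10); zero weights are never chosen
        idx = rng.choices(range(2 * P), weights=up + down)[0]
        theta[idx % P] += 1 if idx < P else -1
        path.append((t, tuple(theta)))

W = [[0.0, 0.02], [0.02, 0.0]]   # assumed weights between two populations
l = [50, 50]                     # assumed population size parameters l(k)
path = gillespie([0, 0], W, l, tau=1.0, t_end=2.0, rng=random.Random(0))
```

Because the deactivation weight of population $k$ is $\theta^k/\tau$, a jump to a negative state has probability zero, in agreement with the boundary conditions of (1.6), whereas nothing prevents $\theta^k$ from exceeding $l(k)$.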

1.3 Including External Time-Dependent Input 

Until now, the microscopic model does not incorporate any time-dependent input into
the system. In analogy to the macroscopic equation (1.3), this input enters into the
model inside the activation rate function $\bar f_k$. Thus, let $I_k(t)$ denote the external input into
a neuron in population $k$ at time $t$, then the time-dependent activation rate is given by

$$\bar f_k(\theta, t) = f\Big(\sum_{j=1}^{P} W_{kj}\,\theta^j + I_k(t)\Big). \tag{1.11}$$

The most important qualitative difference when substituting (1.5) by (1.11) is that the
corresponding Markov process is no longer homogeneous. In particular, the waiting
time distributions in between jumps are no longer exponential, but satisfy

$$\mathbb{P}\big[\Theta_{t+s} = \Theta_t\ \forall\,s \in [0, \Delta t]\ \big|\ \Theta_t = \theta\big] = \exp\Big(-\int_0^{\Delta t} \lambda(\theta, t + s)\,\mathrm{d}s\Big).$$

Hence, the resulting process is an inhomogeneous continuous-time Markov chain;
see, e.g., Sect. 2 in [36]. It is straightforward to write down the corresponding master
equation analogously to (1.6) yielding a system of non-autonomous ordinary differential
equations, cf. the master equation formulation in [8]. Similarly, there exists
the notion of a time-dependent generator for inhomogeneous Markov processes, cf.
Sect. 4.7 in [15]. Employing a standard trick, that is, suitably extending the state
space of the process, we can transform an inhomogeneous to a homogeneous Markov
process [15, 28]. That is, the space-time process $Y_t := (\Theta_t, t)$ is again a homogeneous
Markov process. The initial law of the associated space-time process is $\mathcal{L} \times \delta_0$
on $\mathbb{N}_0^P \times \mathbb{R}_+$. We emphasise that definitions of the space-time process and its initial
law imply that the time-component starts at 0 a.s. and, moreover, moves continuously
and deterministically. That is, the trajectories satisfy in between jumps the differential
equation

$$\dot\theta = 0, \qquad \dot t = 1.$$

Here, the jump intensity $\lambda(\theta, t)$ is given by the sum of all individual time-dependent
rates analogously to (1.8). Finally, the post jump value is given by a Markov kernel
$\mu((\theta, t), \cdot) \times \delta_t$ as there clearly do not occur jumps in the progression of time and
$\mu$ is the obvious time-dependent modification of (1.10).

It thus follows that the space-time process $(\Theta_t, t)_{t \ge 0}$ is a homogeneous Piecewise
Deterministic Markov Process (PDMP); see, e.g., [14, 16, 26]. This connection is
particularly important as we apply in the course of the present study limit theorems
developed for this type of processes; see [27]. Finally, for the space-time process
$(\Theta_t, t)_{t \ge 0}$, we obtain for suitable functions $g : \mathbb{N}_0^P \times \mathbb{R}_+ \to \mathbb{R}$ the generator

$$\mathcal{A}g(\theta, t) = \partial_t g(\theta, t) + \lambda(\theta, t)\int_{\mathbb{N}_0^P}\big(g(\xi, t) - g(\theta, t)\big)\,\mu\big((\theta, t), \mathrm{d}\xi\big). \tag{1.12}$$
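
Since the waiting times of the time-dependent model are no longer exponential, a direct Gillespie scheme does not apply. One possible way to simulate the space-time process, which is not part of the present study and is named here only as an illustration, is thinning (the Lewis-Shedler method): if the gain function is bounded by one, then $\lambda(\theta, t) \le \tau^{-1}\sum_k (\theta^k + l(k))$ for all $t$, so candidate jump times can be proposed at this constant rate and accepted with probability $\lambda(\theta, t)$ divided by the bound. The gain function, the sinusoidal input and all parameter values below are assumptions for the example.

```python
import math
import random

def gain(z):
    """Illustrative logistic gain f, bounded by one."""
    return 1.0 / (1.0 + math.exp(-z))

def rates(theta, t, W, l, tau, inp):
    """Time-dependent activation rates based on (1.11), followed by the deactivation rates."""
    P = len(theta)
    up = [l[k] * gain(sum(W[k][j] * theta[j] for j in range(P)) + inp(k, t)) / tau
          for k in range(P)]
    down = [th / tau for th in theta]
    return up + down

def simulate_thinning(theta0, W, l, tau, inp, t_end, rng):
    """Simulate (Theta_t, t) on [0, t_end] by thinning; returns a list of (time, state)."""
    theta, t = list(theta0), 0.0
    P = len(theta)
    path = [(t, tuple(theta))]
    while True:
        # gain < 1, hence this constant dominates lambda(theta, s) for every s
        bound = (sum(theta) + sum(l)) / tau
        t += rng.expovariate(bound)               # candidate jump time
        if t > t_end:
            return path
        r = rates(theta, t, W, l, tau, inp)
        if rng.random() * bound < sum(r):         # accept with prob. lambda(theta, t) / bound
            idx = rng.choices(range(2 * P), weights=r)[0]
            theta[idx % P] += 1 if idx < P else -1
            path.append((t, tuple(theta)))

W = [[0.0, 0.02], [0.02, 0.0]]                    # assumed weights
l = [50, 50]                                      # assumed population size parameters
inp = lambda k, t: math.sin(t)                    # assumed external input I_k(t)
path_t = simulate_thinning([0, 0], W, l, 1.0, inp, 2.0, random.Random(1))
```

The time component of the space-time process is carried by the variable `t`, which between jumps increases at unit speed while the state `theta` stays constant.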



2 A Precise Formulation of the Limit Theorems 

In this section, we present the precise formulations of the limit theorems. To this
end, we first define a suitable sequence of microscopic models, which gives the
connection between the defining objects of the Wilson-Cowan equation (1.3) and
the microscopic models discussed in Sect. 1.2. Thus, $(Y^n_t)_{t \ge 0} = (\Theta^n_t, t)_{t \ge 0}$, $n \in \mathbb{N}$,
denotes a sequence of microscopic PDMP neural field models of the type as defined
in Sect. 1.3. Each process $(Y^n_t)_{t \ge 0}$ is defined on a filtered probability space
$(\Omega^n, \mathcal{F}^n, (\mathcal{F}^n_t)_{t \ge 0}, \mathbb{P}^n)$, which satisfies the usual conditions. Hence, the defining objects
for the jump models are now dependent on an additional index $n$. That is, $P(n)$
denotes the number of neuron populations in the $n$th model, $l(k, n)$ is the number of
neurons in the $k$th population of the $n$th model and analogously we use the notations
$W^n_{kj}$, $I_{k,n}$ and $\bar f_{k,n}$. However, we note from the beginning that the decay rate $\tau^{-1}$
is independent of $n$ and $\tau$ is the time constant in the Wilson-Cowan equation (1.3).
In the following paragraphs, we discuss the connection of the defining components
of this sequence of microscopic models to the components of the macroscopic limit.

Connection to the Spatial Domain D A key step of connecting the microscopic
models to the solution of Eq. (1.3) is that we need to put the individual neuron populations
into relation to the spatial domain $D$ the solution of (1.3) lives on. To this
end, we assume that each population is located within a sub-domain of $D$ and that
the sub-domains of the individual populations are non-overlapping. Hence, for each
$n \in \mathbb{N}$, we obtain a collection $\mathcal{D}_n$ of $P(n)$ non-overlapping subsets of $D$ denoted
by $D_{1,n}, \dots, D_{P(n),n}$. We assume that each sub-domain is measurable and convex.
The convexity of the sub-domains is a technical condition that allows us to apply
Poincaré's inequality, cf. (4.1). We do not think that this condition is too restrictive
as most reasonable partition domains, e.g., cubes, triangles, are convex. Furthermore,
for all reasonable domains $D$, e.g., all Jordan measurable domains, a sequence of convex
partitions can be found such that additionally the conditions imposed in the limit
theorems below are also satisfied. One may think of obtaining the collection $\mathcal{D}_n$ by
partitioning the domain into $P(n)$ convex sub-domains $D_{1,n}, \dots, D_{P(n),n}$ and confining
each neuron population to one sub-domain. However, it is not required that the
union of the sets in $\mathcal{D}_n$ amounts to the full domain $D$ nor that the partitions consist
of refinements. Necessary conditions on the limiting behaviour of the sub-domains
are very strongly connected to the convergence of initial conditions of the models,
which is a condition in the limit theorems; see below. For the sake of terminological
simplicity, we refer to $\mathcal{D}_n$ simply as the partitions.

We now define some notation for parameters characterising the partitions $\mathcal{D}_n$: the
minimum and maximum Lebesgue measure, i.e., length, area, or volume depending
on the spatial dimension, is denoted by

$$v_-(n) := \min_{k=1,\dots,P(n)} |D_{k,n}|, \qquad v_+(n) := \max_{k=1,\dots,P(n)} |D_{k,n}|, \tag{2.1}$$

and the maximum diameter of the partition is denoted by

$$\delta_+(n) := \max_{k=1,\dots,P(n)} \operatorname{diam}(D_{k,n}), \tag{2.2}$$

where the diameter of a set $D_{k,n}$ is defined as $\operatorname{diam}(D_{k,n}) := \sup_{x,y \in D_{k,n}} |x - y|$. In
the special case of domains obtained by unions of cubes with edge length $n^{-1}$, it
obviously holds that $v_\pm(n) = n^{-d}$ and $\delta_+(n) = \sqrt{d}\,n^{-1}$. It is a necessary condition
in all the subsequent limit theorems that $\lim_{n\to\infty} \delta_+(n) = 0$. This condition implies
on the one hand that $\lim_{n\to\infty} v_+(n) = 0$ as the Lebesgue measure of a set is bounded
in terms of its diameter, and on the other hand (at least in all but degenerate cases
due to the necessary convergence of initial conditions) that $\lim_{n\to\infty} P(n) = \infty$. That
is, in order to obtain a limit the sequence of partitions usually consists of ever finer
sets and the number of populations diverges. Finally, each domain $D_{k,n}$ of the partition
$\mathcal{D}_n$ contains one neuron population 'consisting' of $l(k, n) \in \mathbb{N}$ neurons. Then
we denote by $l_\pm(n)$ the maximum and minimum number of neurons in populations
corresponding to the $n$th model, i.e.,

$$l_-(n) := \min_{k=1,\dots,P(n)} l(k,n), \qquad l_+(n) := \max_{k=1,\dots,P(n)} l(k,n). \tag{2.3}$$
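
The special case of a partition into cubes can be checked numerically. The sketch below tiles the unit cube $[0,1]^d$ with cubes of edge length $n^{-1}$ and evaluates the parameters (2.1) and (2.2), computing each diameter by brute force as the largest distance between two vertices (for a convex polytope the supremum is attained there). The choice of the unit cube as domain and the function names are assumptions for this illustration.

```python
import itertools
import math

def cube_vertices(corner, h):
    """All 2^d vertices of the cube with the given lower corner and edge length h."""
    d = len(corner)
    return [tuple(c + h * b for c, b in zip(corner, bits))
            for bits in itertools.product((0, 1), repeat=d)]

def diameter(points):
    """Largest Euclidean distance between two of the given points."""
    return max(math.dist(p, q) for p in points for q in points)

def partition_parameters(n, d):
    """Return (P(n), v_-(n), v_+(n), delta_+(n)) for the cube partition of [0, 1]^d."""
    h = 1.0 / n
    corners = [tuple(i * h for i in idx)
               for idx in itertools.product(range(n), repeat=d)]
    volumes = [h ** d for _ in corners]           # Lebesgue measure of each cube
    diameters = [diameter(cube_vertices(c, h)) for c in corners]
    return len(corners), min(volumes), max(volumes), max(diameters)

P_n, v_minus, v_plus, delta_plus = partition_parameters(4, 2)
```

For $n = 4$ and $d = 2$ this yields $P(n) = 16$ sub-domains with $v_\pm(n) = 4^{-2}$ and $\delta_+(n) = \sqrt{2}/4$, in agreement with the formulas stated above.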

Connection to the Weight Function w We assume that there exists a function $w :
D \times D \to \mathbb{R}$ such that the connection to the discrete weights is given by

$$W^n_{kj} := \frac{1}{|D_{k,n}|}\int_{D_{k,n}}\Big(\int_{D_{j,n}} w(x,y)\,\mathrm{d}y\Big)\mathrm{d}x, \tag{2.4}$$

where $w$ is the same function as in the Wilson-Cowan equation (1.3). For the definition
of the activation rate at time $t$, we thus obtain

$$\bar f_{k,n}(\theta^n, t) := f\Big(\sum_{j=1}^{P(n)} W^n_{kj}\,\frac{\theta^{j,n}}{l(j,n)} + I_{k,n}(t)\Big). \tag{2.5}$$

As already highlighted by Bressloff [5], the transition rates are not uniquely defined
by the requirement that a possible limit to the microscopic models is given by the
Wilson-Cowan equation (1.3). If in (2.5), the definition of the transition rates is
changed to

$$\bar f_{k,n}(\theta^n, t) := f^n\Big(\sum_{j=1}^{P(n)} W^n_{kj}\,\frac{\theta^{j,n}}{l(j,n)} + I_{k,n}(t)\Big),$$

where $f^n$, $n \in \mathbb{N}$, is a sequence of functions converging uniformly to $f$, then all
limit theorems remain valid. The proof can be carried out as presented adding and
subtracting the appropriate term where the additional difference term vanishes due to
$\sup_{x \in \mathbb{R}} |f^n(x) - f(x)| \to 0$ for $n \to \infty$. Hence, any microscopic model with gain
rates $\bar f_{k,n}$ of such a form reduces to the same Wilson-Cowan equation in the limit.
Clearly, the same applies analogously to the decay rate $\tau$, the weights $w$, and the
input $I$.



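Connection to the Input Current I The external input which is applied to neurons
in a certain population is obtained by spatially averaging a space-time input over the
sub-domain that population is located in, i.e.,

$$I_{k,n}(t) := \frac{1}{|D_{k,n}|}\int_{D_{k,n}} I(t,x)\,\mathrm{d}x. \tag{2.6}$$

This completes the definition of the Markov jump processes $(\Theta^n_t, t)_{t \ge 0}$. For the
sake of completeness, we repeat the definition of the total jump rate

$$\lambda^n(\theta^n, t) := \frac{1}{\tau}\sum_{k=1}^{P(n)}\big(\theta^{k,n} + l(k,n)\,\bar f_{k,n}(\theta^n, t)\big)$$

and the transition measure $\mu^n$ is defined by

$$\mu^n\big((\theta^n, t), \{\theta^n - e_k\}\big) := \frac{1}{\tau}\,\frac{\theta^{k,n}}{\lambda^n(\theta^n, t)}, \qquad \mu^n\big((\theta^n, t), \{\theta^n + e_k\}\big) := \frac{1}{\tau}\,\frac{l(k,n)\,\bar f_{k,n}(\theta^n, t)}{\lambda^n(\theta^n, t)}$$

for all $k = 1, \dots, P(n)$.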



Connection to the Solution v As functions of time, the paths of the PDMP
$(\Theta^n_t, t)_{t \ge 0}$ and the solution $v$ live on different state spaces. The former takes values
in $\mathbb{N}_0^{P(n)} \times \mathbb{R}_+$ and the latter in $L^2(D)$. Thus, in order to compare these two, we
have to introduce a mapping that maps the stochastic process onto $L^2(D)$. In [27],
the authors called such a mapping a coordinate function, which is also the terminology
used in [13]. In fact, the limit theorems we subsequently present actually are for
the processes we obtain from the composition of the coordinate functions with the
PDMPs. Here, it is important to note that for each $n \in \mathbb{N}$ the coordinate functions
may (and usually do) differ; however, they project the process into the common
space $L^2(D)$. For the mean field models, we define the coordinate functions for all
$n \in \mathbb{N}$ by

$$\nu^n : \mathbb{N}_0^{P(n)} \to L^2(D) : \theta^n \mapsto \sum_{k=1}^{P(n)} \frac{\theta^{k,n}}{l(k,n)}\,\mathbb{I}_{D_{k,n}}. \tag{2.7}$$

Clearly, each $\nu^n$ is a measurable map into $L^2(D)$. For the composition of $\nu^n$ with the
stochastic process $(\Theta^n_t, t)_{t \ge 0}$, we also use the abbreviation $\nu^n_t := \nu^n(\Theta^n_t)$, and hence
the resulting stochastic process $(\nu^n_t)_{t \ge 0}$ is an adapted càdlàg process taking values
in $L^2(D)$. This process thus states the activity at a location $x \in D$ as the fraction of
active neurons in the population, which is located around this location.

Connection of the Initial Conditions One condition in the subsequent limit theorems 
is the convergence of initial conditions in probability, i.e., the assumption that 

\[ \lim_{n \to \infty} \mathbb{P}^n\big[ \| v^n(\Theta^n_0) - v_0 \|_{L^2(D)} > \epsilon \big] = 0 \quad \forall\, \epsilon > 0. \tag{2.8} \]

It is easy to see that such a sequence of initial conditions $\Theta^n_0$, $n \in \mathbb{N}$, can be found for any deterministic initial condition $v_0$ under some reasonable conditions on the domain $D$ and the sequence of partitions $\mathcal{D}_n$. Hence, the assumption (2.8) can always be satisfied. For example, we may define such a sequence of initial conditions by

\[ \Theta^{k,n}_0 = \operatorname*{argmin}_{i=1,\dots,l(k,n)} \left| \frac{1}{|D_{k,n}|} \int_{D_{k,n}} v(0,x)\,dx - \frac{i}{l(k,n)} \right|. \]

Next, assuming that the partitions fill the whole domain $D$ for $n \to \infty$, i.e., $\lim_{n\to\infty} |D \setminus \bigcup_{k=1}^{P} D_{k,n}| = 0$, and that the maximal diameter of the sets decreases to zero, i.e., $\lim_{n\to\infty} \delta_+(n) = 0$, it is easy to see using the Poincaré inequality (4.1) that the above definition of the initial condition implies that $\|v^n_0 - v(0)\|_{L^2(D)} \to 0$ and $\sup_{n\in\mathbb{N}} \|v^n_0\|^{2r}_{L^2(D)} < \infty$ for all $r \ge 1$. Then (2.8) holds trivially as the initial condition is deterministic and converges. A simple non-degenerate sequence of initial conditions is obtained by choosing random initial conditions with the above value as their mean and sufficiently fast decreasing fluctuations. Furthermore, a sequence of partitions which satisfies the above conditions also exists for a large class of reasonable domains $D$. Assume that $D$ is Jordan measurable, i.e., a bounded domain such that the boundary is a Lebesgue null set, and let $C_n$ be the smallest grid of cubes with edge length $1/n$ covering $D$. We define $\mathcal{D}_n$ to be the set of all cubes which are fully in $D$. As $D$ is Jordan measurable, these partitions fill up $D$ from inside and $\delta_+(n) \to 0$. For a more detailed discussion of these aspects, we refer to [26].
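The construction of deterministic initial conditions by rounding the local average of the initial activity to the nearest admissible fraction can be sketched in a few lines of Python. This is an illustration under assumptions: the local averages of v(0, ·) over the sub-domains are taken as given in the list `v0_mean`, the helper name `initial_state` is hypothetical, and the count i = 0 is admitted in the minimisation as well, so that zero activity can be represented.

```python
def initial_state(v0_mean, l):
    """For each population k, return the count i in {0, ..., l[k]} whose fraction
    i / l[k] is closest to the local average v0_mean[k] of the initial activity."""
    return [min(range(l_k + 1), key=lambda i: abs(m - i / l_k))
            for m, l_k in zip(v0_mean, l)]


if __name__ == "__main__":
    # the rounding error per population is at most 1 / (2 * l_k)
    for l_k in (10, 100, 1000):
        i = initial_state([0.1234], [l_k])[0]
        print(l_k, i, abs(0.1234 - i / l_k))
```

For local averages in [0, 1] the error in each population is bounded by 1 / (2 l(k,n)), so the quantisation error vanishes as the number of neurons per population grows; the remaining error is the piecewise constant approximation of v(0, ·), which is controlled by the diameter of the sub-domains.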

In the remainder of this section, we now collect the main results of this article. We 
start with the law of large numbers, which establishes the connection to the determin- 
istic mean field equation, and then proceed to central limit theorems which provide 
the basis for a Langevin approximation. The proofs of the results are deferred to 
Sect. 4. 

2.1 A Law of Large Numbers 

The first law of large numbers takes the following form. Note that the assumptions 
imply that the number of neuron populations diverges. 

Theorem 2.1 (Law of large numbers) Let $w \in L^2(D) \times L^2(D)$ and $I \in L^2_{loc}(\mathbb{R}_+, H^1(D))$. Assume that the sequence of initial conditions converges to $v(0)$ in probability in the space $L^2(D)$, i.e., (2.8) holds, that $\mathbb{E}^n \Theta^{k,n}_0 \le l(k,n)$, and that

\[ \lim_{n\to\infty} \delta_+(n) = 0, \qquad \lim_{n\to\infty} \ell_-(n) = \infty \tag{2.9} \]

holds. Then it follows that the sequence of $L^2(D)$-valued jump processes $(v^n_t)_{t \ge 0}$ converges uniformly on compact time intervals in probability to the solution $v$ of the Wilson-Cowan equation (1.3), i.e., for all $T, \epsilon > 0$ it holds that

\[ \lim_{n\to\infty} \mathbb{P}^n\Big[ \sup_{t\in[0,T]} \| v^n_t - v(t) \|_{L^2(D)} > \epsilon \Big] = 0. \tag{2.10} \]

Moreover, if for $r \ge 1$ the initial conditions satisfy in addition $\sup_{n\in\mathbb{N}} \mathbb{E}^n \|v^n_0\|^{2r}_{L^2(D)} < \infty$, then convergence in the $r$th mean holds, i.e., for all $T > 0$

\[ \lim_{n\to\infty} \mathbb{E}^n \sup_{t\in[0,T]} \| v^n_t - v(t) \|^{r}_{L^2(D)} = 0. \tag{2.11} \]

Remark 2.1 The norm of uniform convergence $\sup_{t\in[0,T]} \|\cdot\|_{L^2(D)}$, which we used in Theorem 2.1, is a very strong norm on the space of $L^2(D)$-valued càdlàg functions on $[0,T]$. Hence, due to continuous embeddings, the result immediately extends to weaker norms, e.g., the norms of $L^p((0,T), L^2(D))$ for all $1 \le p < \infty$. Also, for the state space, weaker spatial norms can be chosen, e.g., $L^p(D)$ with $1 \le p < 2$ or any norm on the duals $H^{-\alpha}(D)$ of Sobolev spaces with $\alpha > 0$. If weaker norms for the





state space are considered, it is possible to relax the conditions of Theorem 2.1 by sharpening some estimates in the proof of the theorem. The results in the following corollary cover the whole range of $\alpha \ge 0$ and split it into sections with weakening conditions. In particular, note that after passing to weaker norms, the convergence does not necessitate that the neuron numbers per population diverge. However, regarding the divergence of the number of neuron populations, the condition $\delta_+(n) \to 0$ cannot be relaxed.



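Corollary 2.1 Let $\alpha \ge 0$ and set

\[ q := \begin{cases} \dfrac{2d}{d+2\alpha} & \text{if } 0 \le \alpha < d/2, \\ 1+ & \text{if } \alpha = d/2, \\ 1 & \text{if } d/2 < \alpha < \infty. \end{cases} \tag{2.12} \]

Further, assume that $w \in L^q(D) \times L^2(D)$ and $I \in L^2_{loc}(\mathbb{R}_+, H^1(D))$ and that the sequence of initial conditions converges to $v(0)$ in probability in the space $H^{-\alpha}(D)$, that $\lim_{n\to\infty} \delta_+(n) = 0$ and

\[ \begin{cases} \lim_{n\to\infty} \dfrac{v_+(n)^{2\alpha/d}}{\ell_-(n)} = 0 & \text{if } 0 \le \alpha < d/2, \\ \lim_{n\to\infty} \dfrac{v_+(n)^{1-}}{\ell_-(n)} = 0 & \text{if } \alpha = d/2, \\ \lim_{n\to\infty} \dfrac{v_+(n)}{\ell_-(n)} = 0 & \text{if } d/2 < \alpha < \infty, \end{cases} \tag{2.13} \]

where $1-$ denotes an arbitrary positive number strictly smaller than 1 (and, correspondingly, $1+$ an arbitrary number strictly greater than 1). Then it holds for all $T, \epsilon > 0$ that

\[ \lim_{n\to\infty} \mathbb{P}^n\Big[ \sup_{t\in[0,T]} \| v^n_t - v(t) \|_{H^{-\alpha}(D)} > \epsilon \Big] = 0 \]

and for $r \ge 1$, if the additional boundedness assumptions of Theorem 2.1 are satisfied, that for all $T > 0$

\[ \lim_{n\to\infty} \mathbb{E}^n \sup_{t\in[0,T]} \| v^n_t - v(t) \|^{r}_{H^{-\alpha}(D)} = 0. \]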



Remark 2.2 We believe that fruitful and illustrative comparisons of these convergence results and their conditions to the results in Kotelenez [17, 18], and particularly, Blount [4] can be made. Here, we just mention that the latter author conjectured the conditions (2.13) to be optimal for the convergence, but was not able to prove this result in his model of chemical reactions with diffusions for the region $\alpha \in (0, d/2]$. For our model, we could achieve these rates.



2.1.1 Infinite-Time Convergence



In the law of large numbers, Theorem 2.1, and its Corollary 2.1, we have presented results of convergence over finite time intervals. Employing a different technique, we are also able to derive a convergence result over the whole positive time axis, motivated by a similar result in [32]. The proof of the following theorem is deferred to Sect. 4.3. Restricted to finite time intervals, the subsequent result is strictly weaker than Theorem 2.1. However, the result is important when one wants to analyse the mean long-time behaviour of the stochastic model via a bifurcation analysis of the deterministic limit, as (2.14) suggests that $\mathbb{E}^n v^n_t$ is close to $v(t)$ for all times $t \ge 0$ for sufficiently large $n$.

Theorem 2.2 Let $\alpha \ge 0$ and assume that the conditions of Corollary 2.1 are satisfied. We further assume that the current input function $I \in L^2_{loc}(\mathbb{R}_+, H^1(D))$ satisfies $\|\nabla_x I\|_{L^\infty(\mathbb{R}_+, L^2(D))} < \infty$, i.e., it is square integrable in $H^1(D)$ over bounded intervals, and possesses first spatial derivatives bounded for almost all $t \ge 0$ in $L^2(D)$. Then it holds that

\[ \lim_{n\to\infty} \sup_{t \ge 0} \mathbb{E}^n \| v^n_t - v(t) \|_{H^{-\alpha}(D)} = 0. \tag{2.14} \]

2.2 A Martingale Central Limit Theorem 

In this section, we present a central limit theorem for a sequence of martingales as- 
sociated with the jump processes v" . A brief, heuristic discussion of the method of 
proof for the law of large numbers explains the importance of these martingales and 
motivates their study. In the proof of the law of large numbers, the central argument 
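relies on the fact that the process $(v^n_t)_{t \ge 0}$ satisfies the decomposition

\[ v^n_t = v^n_0 + \int_0^t \Lambda^n(\Theta^n_s, s) \int_{\mathbb{N}_0^P} \big( v^n(\xi) - v^n(\Theta^n_s) \big)\, \mu^n\big((\Theta^n_s, s), d\xi\big)\, ds + M^n_t. \tag{2.15} \]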

Here, the process $(M^n_t)_{t \ge 0}$ is a Hilbert space-valued, square-integrable, càdlàg martingale, using (2.15) as its definition. We have used this representation of the process $v^n$ in the proof of Theorem 2.2; see Sect. 4.3. We note that the Bochner integral in (2.15) is a.s. well defined due to bounded second moments of the integrand; see (4.7) in the proof of Theorem 2.1. Now a heuristic argument to obtain the convergence to the solution of the Wilson-Cowan equation is the following: The initial conditions converge, the martingale term $M^n$ converges to zero, and the integral term on the right-hand side of (2.15) converges to the right-hand side of the Wilson-Cowan equation (1.3). Hence, the 'solution' $v^n$ of (2.15) converges to the solution $v$ of the Wilson-Cowan equation (1.3). Interpreting Eq. (2.15) as a stochastic evolution equation driven by the martingale $(M^n_t)_{t \ge 0}$ sheds light on the importance of studying this term: from this point of view, the martingale part in the decomposition (2.15) contains all the stochasticity inherent in the system. The idea for deriving a Langevin or linear noise approximation is then to find a non-trivial stochastic limit (in distribution) for the sequence of martingales and to substitute this limiting martingale heuristically into the stochastic evolution equation. It is expected that this new and much less complex process behaves similarly to the process $(v^n_t)_{t \ge 0}$ for sufficiently large $n$. Deriving a suitable limit for $(M^n_t)_{t \ge 0}$ is what we set out to do next. The result can be found in Theorem 2.3 below and takes the form of a central limit theorem.

First of all, what has been said so far implies the necessity of re-scaling the martingale with a diverging sequence in order to obtain a non-trivial limit. The conditions in the law of large numbers imply in particular that the martingale converges uniformly in the mean square to zero, i.e.,

\[ \lim_{n\to\infty} \mathbb{E}^n \sup_{t\in[0,T]} \| M^n_t \|^2_{L^2(D)} = 0, \]

which in turn implies convergence in probability and convergence in distribution to the zero limit.

Furthermore, in contrast to Euclidean spaces, norms on infinite-dimensional spaces are usually not equivalent. In Corollary 2.1, we exploited this fact as it allowed us to obtain convergence results under less restrictive conditions by changing to strictly weaker norms. In the formulation and proof of central limit theorems, the change to weaker norms even becomes an essential ingredient. It is often observed in the literature, see, e.g., [4, 17, 18], that central limit theorems cannot be proven in the strongest norm for which the law of large numbers holds, e.g., $L^2(D)$ in the present setting, but only in a strictly weaker norm. Here, this norm is the norm in the dual of an appropriate Sobolev space. Hence, from now on, we consider for all $n \in \mathbb{N}$ the processes $(v^n_t)_{t \ge 0}$ and the martingales $(M^n_t)_{t \ge 0}$ as taking values in the space $H^{-\alpha}(D)$ for an $\alpha > d$, where $d$ is the dimension of the spatial domain $D$, using the embedding of $L^2(D)$ into $H^{-\alpha}(D)$. The technical significance of the restriction $\alpha > d$ is that these are the indices such that there exists an embedding of $H^{\alpha}(D)$ into an $H^{\alpha_1}(D)$ with $d/2 < \alpha_1 < \alpha$ which is of Hilbert-Schmidt type* due to Maurin's theorem, and $H^{\alpha_1}(D)$ is embedded into $C(D)$ due to the Sobolev embedding theorem. These two properties are essential for the proof of the central limit theorem and their occurrence will be made clear subsequently.

The limit we propose for the re-scaled martingale sequence is a centred diffusion process in $H^{-\alpha}(D)$, that is, a centred continuous Gaussian stochastic process $(X_t)_{t \ge 0}$ taking values in $H^{-\alpha}(D)$ with independent increments and given covariance $C(t)$, $t \ge 0$; see, e.g., [12, 25] for a discussion of Gaussian processes in Hilbert spaces. Such a process is uniquely defined by its covariance operator and conversely, each family of linear, bounded operators $C(t) : H^{\alpha}(D) \to H^{-\alpha}(D)$, $t \ge 0$, uniquely defines a diffusion process** if

(i) each $C(t)$ is symmetric and positive, i.e., $\langle C(t)\phi, \psi \rangle_{H^\alpha(D)} = \langle C(t)\psi, \phi \rangle_{H^\alpha(D)}$ and $\langle C(t)\phi, \phi \rangle_{H^\alpha(D)} \ge 0$ for all $\phi, \psi \in H^\alpha(D)$,

(ii) each $C(t)$ is of trace class, i.e., for one (and thus every) orthonormal basis $\varphi_j$, $j \in \mathbb{N}$, in $H^\alpha(D)$ it holds that

\[ \sum_{j=1}^{\infty} \langle C(t)\varphi_j, \varphi_j \rangle_{H^\alpha(D)} < \infty, \tag{2.16} \]

(iii) and the family $C(t)$, $t \ge 0$, is continuously increasing in $t$ in the sense that the map $t \mapsto \langle C(t)\phi, \psi \rangle_{H^\alpha(D)}$ is continuous and increasing for all $\phi, \psi \in H^\alpha(D)$.

* A continuous embedding of two Hilbert spaces $X \hookrightarrow Y$ is of Hilbert-Schmidt type if for every orthonormal basis $\varphi_j$, $j \in \mathbb{N}$, of $X$ it holds that $\sum_{j=1}^{\infty} \|\varphi_j\|^2_Y < \infty$. Then, more precisely, Maurin's theorem states that for non-negative integers $m, k$, the embedding of $H^{m+k}(D)$ into $H^m(D)$ is of Hilbert-Schmidt type for $k > d/2$; see [2]. The result was generalised to fractional order Sobolev spaces in [35]: Let $D$ be a bounded, strong local Lipschitz domain in $\mathbb{R}^d$ and let $0 \le \alpha_1 < \alpha_2$ be real numbers. Then it holds that the embedding of $H^{\alpha_2 + d/2}(D)$ into $H^{\alpha_1}(D)$ is of Hilbert-Schmidt type.

** Usually, the covariance operator for a Hilbert space-valued process is an operator mapping from the state space into the state space and not into the dual, i.e., in the present situation mapping $H^{-\alpha}(D)$ into itself. Due to the canonical embedding of Hilbert spaces into their dual and the Riesz representation, however, we can effortlessly change from the usual definition to ours and vice versa. Moreover, the symmetry condition thus implies due to the Hellinger-Toeplitz theorem that the operator is self-adjoint, and hence of trace class if and only if (2.16) is satisfied. The choice of the presentation here is due to the fact that it is simpler to evaluate the duality pairing on $H^{-\alpha}(D)$ than the inner product thereon, as the former usually is just the inner product in $L^2(D)$.

We next define, via its covariance, the process which will be the limit identified in the martingale central limit theorem. In order to define the operator $C$, we first define a family of linear operators $G(v(t),t)$ mapping from $H^{\alpha}(D)$ into the dual space $H^{-\alpha}(D)$ via the bilinear form

\[ \langle G(v(t),t)\phi, \psi \rangle_{H^\alpha(D)} = \int_D \phi(x) \Big( \frac{1}{\tau} v(t,x) + \frac{1}{\tau} f\Big( \int_D w(x,y)\, v(t,y)\,dy + I(t,x) \Big) \Big) \psi(x)\,dx. \tag{2.17} \]

It is obvious that this bilinear form is symmetric and positive and, as $v(t)$ is continuous in $t$, it holds that the map $t \mapsto \langle G(v(t),t)\phi, \psi \rangle_{H^\alpha(D)}$ is continuous for all $\phi, \psi \in H^\alpha(D)$. Furthermore, it is easy to see that the operator is bounded, i.e.,

\[ \| G(v(t),t) \|_{L(H^\alpha(D), H^{-\alpha}(D))} = \sup_{\|\phi\|_{H^\alpha(D)} = 1}\ \sup_{\|\psi\|_{H^\alpha(D)} = 1} \big| \langle G(v(t),t)\phi, \psi \rangle_{H^\alpha(D)} \big| < \infty, \]

as the solution $v$ of the Wilson-Cowan equation and the gain function $f$ are pointwise bounded. Hence, due to the Cauchy-Schwarz inequality, the absolute value $|\langle G(v(t),t)\phi, \psi \rangle_{H^\alpha(D)}|$ is bounded by a constant multiple of the product $\|\phi\|_{L^2(D)} \|\psi\|_{L^2(D)}$, and for any $\alpha > 0$ the Sobolev embedding theorem now gives a uniform bound in terms of the norms of $\phi, \psi$ in $H^\alpha(D)$. As a final property, we show that these operators are of trace class if $\alpha > d/2$. Thus, let $(\varphi_j)_{j \in \mathbb{N}}$ be an orthonormal basis in $H^\alpha(D)$; then the Cauchy-Schwarz inequality yields

\[ \big| \langle G(v(t),t)\varphi_j, \varphi_j \rangle_{H^\alpha(D)} \big| \le \frac{1}{\tau} \big( 1 + \|f\|_0 \big)\, \|\varphi_j\|^2_{L^2(D)}. \]

Summing these inequalities for all $j \in \mathbb{N}$, we find that the resulting right-hand side is finite as, due to Maurin's theorem, the embedding of $H^\alpha(D)$ into $L^2(D)$ is of Hilbert-Schmidt type. Moreover, their trace is even bounded independently of $t$.

Now, it holds that the map $t \mapsto G(v(t),t)$ is continuous taking values in the Banach space of trace class operators, hence we define trace class operators $C(t)$ from $H^{\alpha}(D)$ into $H^{-\alpha}(D)$ via the Bochner integral for all $t \ge 0$

\[ C(t) := \int_0^t G(v(s),s)\,ds. \tag{2.18} \]

Clearly, the resulting bilinear form $\langle C(t)\,\cdot, \cdot \rangle_{H^\alpha(D)}$ inherits the properties of the bilinear form (2.17). Moreover, due to the positivity of the integrands, it follows that $\langle C(t)\phi, \phi \rangle_{H^\alpha(D)}$ is increasing in $t$ for all $\phi \in H^\alpha(D)$. Hence, the family of operators $C(t)$, $t \ge 0$, satisfies the above conditions (i)-(iii), and thus uniquely defines an $H^{-\alpha}(D)$-valued diffusion process.

We are now able to state the martingale central limit theorem. The proof of the 
theorem is deferred to Sect. 4.4. 



Theorem 2.3 (Martingale central limit theorem) Let $\alpha > d$ and assume that the conditions of Theorem 2.1 are satisfied. In particular, convergence in the mean holds, i.e., (2.11) holds for $r = 1$. Additionally, we assume it holds that

\[ \lim_{n\to\infty} \frac{v_-(n)}{v_+(n)}\, \frac{\ell_-(n)}{\ell_+(n)} = 1. \tag{2.19} \]

Then it follows that the sequence of re-scaled $H^{-\alpha}(D)$-valued martingales

\[ \Big( \sqrt{\frac{\ell_-(n)}{v_+(n)}}\; M^n_t \Big)_{t \ge 0} \]

converges weakly on the space of $H^{-\alpha}(D)$-valued càdlàg functions to the $H^{-\alpha}(D)$-valued diffusion process defined by the covariance operator $C(t)$ given by (2.18).

Remark 2.3 In connection with the results of Theorem 2.3, two questions may arise. First, in what sense is there uniqueness of the re-scaling sequence, and hence of the limiting diffusion? That is, does a different scaling also produce a (non-trivial) limit, or, rephrased, is the proposed scaling the correct one to look at? Secondly, the theorem deals with the norms for the range of $\alpha > d$ in the Hilbert scale; what can be said about convergence in the stronger norms corresponding to the range of $\alpha \in [0, d]$? Does there exist a limit? We conclude this section addressing these two issues.

Regarding the first question, it is immediately obvious that the re-scaling sequence $\ell_-(n)/v_+(n)$, which we denote by $\rho_n$ in the following, is not a unique sequence yielding a non-trivial limit. Re-scaling the martingales $M^n$ by any sequence of the form $\sqrt{c\rho_n}$ yields a convergent martingale sequence. However, the limiting diffusion differs only in its covariance operator, which is also re-scaled by $c$, and hence the limit is essentially the same process with either 'stretched' or 'shrunk' variability. However, the asymptotic behaviour of the re-scaling sequences which allow for a non-trivial weak limit is unique. In general, by considering different re-scaling sequences $\rho^*_n$, we obtain three possibilities for the convergence of the sequence $\sqrt{\rho^*_n}\, M^n$. If $\rho^*_n$ is of the same speed of convergence as $\rho_n$, i.e., for $\rho^*_n = O(\rho_n)$, the thus re-scaled sequence converges again to a diffusion process for which the covariance operator is proportional to (2.18). This is then just a re-scaling by a sequence (asymptotically) proportional to $\rho_n$ as discussed above. Secondly, if the convergence is slower, i.e., $\rho^*_n = o(\rho_n)$, then the same methods as in the law of large numbers show that the sequence converges to zero uniformly on compacts in probability, hence also convergence in distribution to the degenerate zero process follows. Thus, one only obtains the trivial limit. Finally, if we re-scale by a sequence that diverges faster, i.e., $\rho_n = o(\rho^*_n)$, we can show that there does not exist a limit. This follows from general necessary conditions for the preservation of weak limits under transformation, which presuppose that $\sqrt{\rho^*_n/\rho_n}$ has to converge in order for $\sqrt{\rho^*_n}\, M^n$ to possess a limit in distribution; see Theorem 2 in [29]. As the sequence $\rho^*_n/\rho_n$ diverges, this cannot hold.

Unfortunately, the second question cannot be answered with the same clarity when considering non-trivial limits. Essentially, we can only say that the currently used methods do not allow for any conclusion on convergence. The limitations are the following: The central problem is that for the parameter range $\alpha \in [0, d]$ the current method does not provide tightness of the re-scaled martingale sequence, hence we cannot infer that the sequence possesses a convergent subsequence. However, if tightness can be established in a different way, then for the range $\alpha \in (\max\{1, d/2\}, d]$, the limit has to be the diffusion process defined by the operator (2.18), as follows from the characterisation of any limit in the proof of the theorem. Here, the lower bound of $\max\{1, d/2\}$ results, on the one hand, from our estimation technique, which necessitates $\alpha > 1$, and on the other hand, from the definition of the limiting diffusion. Recall that the covariance operator is only of trace class for $\alpha > d/2$. Hence, for $\alpha \in [0, d/2]$, we can no longer infer that the limiting diffusion even exists.

2.3 The Mean-Field Langevin Equation 

An important property of the limiting diffusion in view of analytic and numerical studies is that it can be represented by a stochastic integral with respect to a cylindrical or $Q$-Wiener process. For a general discussion of infinite-dimensional stochastic integrals, we refer to [12]. First, let $(W_t)_{t \ge 0}$ be a cylindrical Wiener process on $H^{-\alpha}(D)$ with covariance operator being the identity. Then $G(v(t),t) \circ \iota^{-1}$ is a trace class operator on $H^{-\alpha}(D)$ for suitable values of $\alpha$. Here, $\iota^{-1} : H^{-\alpha}(D) \to H^{\alpha}(D)$ is the Riesz representation, i.e., the usual identification of a Hilbert space with its dual. The operator $G(v(t),t) \circ \iota^{-1}$ possesses a unique square root, which we denote by $\sqrt{G(v(t),t) \circ \iota^{-1}}$ and which is a Hilbert-Schmidt operator on $H^{-\alpha}(D)$. It follows that the stochastic integral process

\[ Z_t := \int_0^t \sqrt{G(v(s),s) \circ \iota^{-1}}\; dW_s \tag{2.20} \]

is a diffusion process in $H^{-\alpha}(D)$ with covariance operator $C(t)$. That is, $(Z_t)_{t \ge 0}$ is a version of the limiting diffusion in Theorem 2.3. Now, formally substituting for the limits in (2.15) yields the linear noise approximation

\[ U_t = v_0 + \int_0^t \tau^{-1}\big( -U_s + F(U_s, s) \big)\,ds + \epsilon_n \int_0^t \sqrt{G(v(s),s) \circ \iota^{-1}}\; dW_s, \]

or in differential notation

\[ dU_t = \tau^{-1}\big( -U_t + F(U_t, t) \big)\,dt + \epsilon_n \sqrt{G(v(t),t) \circ \iota^{-1}}\; dW_t, \qquad U_0 = v_0, \tag{2.21} \]

where $\epsilon_n = \sqrt{v_+(n)/\ell_-(n)}$ is small for large $n$. Here, we have used the operator notation

\[ F : H^{-\alpha}(D) \times \mathbb{R}_+ \to H^{-\alpha}(D) : F(g,t)(x) = f\big( \langle g, w(x,\cdot) \rangle_{H^\alpha(D)} + I(t,x) \big). \]

Equation (2.21) is an infinite-dimensional stochastic differential equation with additive (linear) noise. Here, additive means that the coefficient in the diffusion term does not depend on the solution $U_t$. A second formal substitution yields the Langevin approximation. Here, the dependence of the diffusion coefficient on the deterministic limit $v$ is formally substituted by a dependence on the solution. That is, we obtain a stochastic partial differential equation with multiplicative noise given by

\[ V_t = V_0 + \int_0^t \tau^{-1}\big( -V_s + F(V_s, s) \big)\,ds + \epsilon_n \int_0^t \sqrt{G(V_s,s) \circ \iota^{-1}}\; dW_s, \]

or in differential notation

\[ dV_t = \tau^{-1}\big( -V_t + F(V_t, t) \big)\,dt + \epsilon_n \sqrt{G(V_t,t) \circ \iota^{-1}}\; dW_t. \tag{2.22} \]

Note that the derivation of the above equations was only formal, hence we have to address the existence and uniqueness of solutions and the proper setting for these equations. This is left for future work. Which of the two, if any at all, is the correct diffusion approximation to use is an ongoing discussion and probably undecidable as long as a criterion of approximation quality is lacking. First of all, note that for both versions the noise term vanishes for $n \to \infty$, and thus both have the Wilson-Cowan equation as their limit. Also, neither of them approximates even the first moment of the microscopic models exactly. This means that for neither of them does the mean solve the Wilson-Cowan equation, which would be the case only if $f$ were linear. However, they are close to the mean of the discrete process. We discuss this aspect in Appendix B.

Furthermore, we already observe in the central limit theorem, and thus also in the linear noise and Langevin approximations, that the covariance (2.18) or the drift and the structure of the diffusion terms in (2.21) and (2.22), respectively, are independent of objects resulting from the microscopic models. They are defined purely in terms of the macroscopic limit. This observation supports the conjecture that these approximations are independent of the particular microscopic models converging to the same deterministic limit. Analogous statements hold also for derivations from the van Kampen system size expansion [5] and in related limit theorems for reaction diffusion models [4, 17, 18]. The only object reminiscent of the microscopic models in the continuous approximations is the re-scaling sequence $\epsilon_n$. However, the re-scaling is proportional to the square root of $\ell_-(n)/v_+(n)$, i.e., the number of neurons per area divided by the size of the area, which is just the local density of particles. Therefore, in the approximations, the noise scales inversely to the square root of the neuron density in this model, which interpreted in this way can also be considered a macroscopic fixed parameter and chosen independently of the approximating sequence.

Remark 2.4 The stochastic partial differential equations (2.21) and (2.22), which we proposed as the linear noise or Langevin approximation, respectively, are not necessarily unique as the representation of the limiting diffusion as a stochastic integral process (2.20) may not be unique. It will be subject for further research efforts to analyse the practical implications and usability of this Langevin approximation. Let

$Q$ be a trace class operator, $(W^Q_t)_{t \ge 0}$ be a $Q$-Wiener process and let $B(v(t),t)$ be operators such that $B(v(t),t) \circ Q \circ B(v(t),t)^* = G(v(t),t) \circ \iota^{-1}$, where $*$ denotes the adjoint operator. Then also the stochastic integral process

\[ \int_0^t B(v(s),s)\,dW^Q_s \]

is a version of the limiting diffusion in Theorem 2.3 and the corresponding linear noise and Langevin approximations are given by

\[ dU^Q_t = \tau^{-1}\big( -U^Q_t + F(U^Q_t, t) \big)\,dt + \epsilon_n B(v(t),t)\,dW^Q_t \]

and

\[ dV^Q_t = \tau^{-1}\big( -V^Q_t + F(V^Q_t, t) \big)\,dt + \epsilon_n B(V^Q_t,t)\,dW^Q_t. \]

We conclude this section by presenting one particular choice of a diffusion coefficient and a Wiener process. We take $(W^Q_t)_{t \ge 0}$ to be a cylindrical Wiener process on $L^2(D)$ with covariance $Q = \mathrm{Id}_{L^2}$. Then we can choose $B(t) = j \circ (\cdot \sqrt{g(t)}) \in L(L^2(D), H^{-\alpha}(D))$, where $j$ is the embedding operator $L^2(D) \hookrightarrow H^{-\alpha}(D)$ in the sense of (1.2) and $(\cdot \sqrt{g(t)}) \in L(L^2(D), L^2(D))$ denotes the pointwise product with a function in $L^2(D)$, i.e.,

\[ \big( \phi \cdot \sqrt{g(t)} \big)(x) = \phi(x) \sqrt{ \tau^{-1} v(t,x) + \tau^{-1} f\Big( \int_D w(x,y)\, v(t,y)\,dy + I(t,x) \Big) }. \]

We first investigate the operator $G(v(t),t) \circ \iota^{-1}$ and write it in more detail as the following composition of operators:

\[ G(v(t),t) \circ \iota^{-1} = j \circ (\cdot\, g(t)) \circ k \circ \iota^{-1}, \]

where $k$ is the embedding operator $H^{\alpha}(D) \hookrightarrow L^2(D)$. Next, the Hilbert adjoint $B^* \in L(H^{-\alpha}, L^2)$ is given by $B^* = (\cdot \sqrt{g(t)}) \circ k \circ \iota^{-1}$, which is easy to verify. Hence, the stochastic integral of $B(t)$ with respect to $W^Q$ is again a version of the limiting martingale as

\[ B(t) \circ Q \circ B^*(t) = j \circ (\cdot \sqrt{g(t)}) \circ \mathrm{Id}_{L^2} \circ (\cdot \sqrt{g(t)}) \circ k \circ \iota^{-1} = j \circ (\cdot\, g(t)) \circ k \circ \iota^{-1} = G(v(t),t) \circ \iota^{-1}. \]
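For numerical experiments, the Langevin approximation with pointwise multiplicative noise driven by a cylindrical Wiener process on $L^2(D)$ can be discretised in a straightforward way. The following Python sketch is one possible scheme under stated assumptions, not a method taken from the analysis above: the domain is split into P cells of equal size h with l neurons each, so that the noise intensity is sqrt(h / l); the cylindrical noise is replaced by independent Brownian increments per cell with variance dt / h; time stepping is explicit Euler-Maruyama; and the local noise variance is clipped at zero, since the discretised solution may leave the region where it is non-negative. With these choices the factors of h cancel and the noise in cell k has standard deviation sqrt(g_k dt / l). The function name `langevin_path` and its arguments are hypothetical.

```python
import math
import random


def langevin_path(v0, W, I, f, tau, l, dt, steps, rng):
    """Euler-Maruyama sketch for a spatially discretised Langevin approximation.

    v0: initial activity per cell; W, I: discrete weights and inputs (assumed
    time-independent); f: gain function; l: neurons per cell (assumed equal).
    """
    v = list(v0)
    P = len(v)
    for _ in range(steps):
        # gain evaluated at the total input into each cell
        drive = [f(sum(W[k][j] * v[j] for j in range(P)) + I[k]) for k in range(P)]
        new_v = []
        for k in range(P):
            drift = (-v[k] + drive[k]) / tau
            g = max((v[k] + drive[k]) / tau, 0.0)      # clipped local noise variance
            noise = math.sqrt(g * dt / l) * rng.gauss(0.0, 1.0)
            new_v.append(v[k] + drift * dt + noise)
        v = new_v
    return v


if __name__ == "__main__":
    sigmoid = lambda x: 1.0 / (1.0 + math.exp(-x))
    W = [[0.5, 0.5], [0.5, 0.5]]
    print(langevin_path([0.1, 0.1], W, [0.0, 0.0], sigmoid, tau=1.0, l=200,
                        dt=0.01, steps=500, rng=random.Random(0)))
```

Replacing `v[k]` and `drive[k]` in the noise term by values from a separately computed deterministic solution would give the corresponding sketch for the linear noise approximation; for very large `l` both schemes reduce to an explicit Euler discretisation of the discretised Wilson-Cowan equation.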

3 Discussion and Extensions 

In this article, we have presented limit theorems that connect finite, discrete microscopic models of neural activity to the Wilson-Cowan neural field equation. The results state qualitative connections between the models formulated as precise probabilistic convergence concepts. Thus, the results strengthen the connection derived in a heuristic way from the van Kampen system size expansion.

A general limitation of mathematically precise approaches to approximations, cf. 
also the propagation of chaos limit theorems in [30], is that the microscopic models
are usually defined via the limit. In other words, the limit has to be known a priori, and 
we look for models which converge to this limit. Thus, in contrast to the van Kampen 
system size expansion, the presented results are not a step-by-step modelling pro- 
cedure in the sense that, via a constructive limiting procedure, a microscopic model 
yields a deterministic or stochastic approximation. Hence, it might be objected that 
the presented method can only be used a posteriori in order to justify a macroscopic 
model from a constructed microscopic model and that somehow one has to 'guess' 
the correct limit in advance. Several remarks can be made to answer this objection. 

First, this observation is certainly true, but not necessarily a drawback. On the contrary, when both microscopic and macroscopic models are available, then it is rather important to know how these are connected and to characterise this connection qualitatively and quantitatively. Concerning neural field models, this precise connection was simply not available so far for the well-established Wilson-Cowan model. Furthermore, when starting from a stochastic microscopic description and working through the proofs of the conditions for convergence for given microscopic models, one obtains very strong hints on the structure of a possible deterministic limit. Therefore, our results can also ease the procedure of 'guessing the correct limit'.

Secondly, often a phenomenological, deterministic model, which is an approximation to an inherently probabilistic process, is derived from ad-hoc heuristic arguments. Given that the model has proved useful, one often aims to derive a justification from first principles and/or a stochastic version, which keeps the features of the deterministic model, but also accounts for the formerly neglected fluctuations. A standard, though somewhat simple, approach to obtain stochastic versions consists of adding (small) noise to the deterministic equations. This article provides a second approach, which consists of finding microscopic models that converge to the deterministic limit and obtaining a stochastic correction via a central limit argument.

Thirdly and finally, the method also provides an argument for new equations, i.e., the Langevin and linear noise approximations, which can be used to study the stochastic fluctuations in the model. Furthermore, in contrast to previous studies, we do not provide deterministic moment equations but stochastic processes, which can be studied, e.g., via Monte Carlo simulations, with respect to a large number of pathwise properties and dynamics beyond first and second moments.

We now conclude this article by commenting on the feasibility of our approach connecting microscopic Markov models to deterministic macroscopic equations when dealing with different master equation formulations that appear in the literature. Additionally, the following discussions also relate the model (1.6) considered in this article to other master equation formulations. We conjecture that results analogous to those presented for the Wilson-Cowan equation (1.3) in Sect. 2 also hold for these variations of the master equations. This should be possible to achieve by an adaptation of the methods of proof presented, although we have not performed the computations in detail.

3.1 A Variation of the Master Equation Formulation

A first variation of the discrete model we discussed in Sect. 1.2 was considered in the
articles [8, 9], and a version restricted to a bounded state space also appears in [31].
This model consists of the master equation stated below in (3.2), which closely
resembles (1.6). In the earlier reference [8], the model was introduced with a different
interpretation called the effective spike model. We briefly explain this interpretation
before presenting the master equation. Instead of interpreting $P$ as the number of
neuron populations, in this model $P$ denotes the number of different neurons in the
network located within a spatial domain $D$. Then $\Theta^k_t$, the state of the $k$th neuron,
counts the number of 'effective' spikes this neuron has emitted in the past up to
time $t$. Effective spikes are those spikes that still influence the dynamics of the
system, e.g., via a post-synaptic potential. State transitions adding or subtracting one
effective spike for the $k$th neuron are governed by a firing rate function $\bar f_k$, which
depends on the input into neuron $k$, and a decay rate $\tau^{-1}$. The constant decay rate
indicates that emitted spikes are effective for a time interval of length $\tau$, and the gain
function is defined, neglecting external input, by



\[ \bar f_k(\theta) = f^*\Big(\sum_{j=1}^{P} W_{kj}\,\theta^j\Big), \]

where $f^*$ is a certain non-negative, real function. It is stated clearly in [9] that the
function $f^*$ is not equal to the gain function $f$ in the proposed limiting Wilson-Cowan
equation (1.3), but rather connected to $f$ such that

\[ \mathbb{E} f^*\Big(\sum_{j=1}^{P} W_{kj}\,\Theta^j_t\Big) = f\Big(\sum_{j=1}^{P} W_{kj}\,\mathbb{E}\Theta^j_t\Big) + \text{higher order terms}. \tag{3.1} \]

The authors in [9] state that for any function $f$ such a function $f^*$ can be found.
Then the process $\Theta_t = (\Theta^1_t, \ldots, \Theta^P_t)$ is a jump Markov process given by the master
equation



with boundary conditions $P[\theta, t] = 0$ if $\theta \notin \mathbb{N}_0^P$ as stated in [9]. The advantage of the
effective spike model interpretation over the interpretation as neurons per population
is that the unbounded state space of the model is justified. In principle, there can be an
arbitrary number of spikes emitted in the past that are still active. However, a disadvantage
of the master equation (3.2) is that, for taking the limit, it lacks a parameter corresponding
to the system size providing a natural small parameter in the van Kampen system size
expansion. This explains the shift in the interpretation of the master equation in the
study [9] following [8], and subsequently in [5], to the interpretation we presented in
Sect. 1.2, which provides the system-size parameters $l(k)$.

On the level of Markov jump processes, the master equation (3.2) describes dynamics
similar to the master equation (1.6), only replacing the activation
rate $\tau^{-1} l(k) \bar f_k(\theta)$ in (1.6) by $\bar f_k(\theta)$, which is independent of the parameter $l(k)$.
Thus, the model (3.2) can be understood as resulting from (1.6) after a limit procedure
taking $l(k) \to \infty$ has been applied, where the firing rate functions are connected
via the formal limit $\lim_{l(k)\to\infty} l(k) \bar f_k(\theta) = \bar f_k(\theta)$. A qualitative interpretation of this
limit procedure connecting the two types of models is given in [8]. This observation
motivated the model in [5], stepping back one limit procedure, and thus providing the
correct framework for the derivation of limit theorems.

It would be an interesting addition to the limit theorems in Theorem 2.1 to derive
a law of large numbers for the models (3.2) with stochastic mean activity $\nu^n$
as defined in (2.7) and suitably chosen weights $W_{kj}$. Clearly, the macroscopic limit
should be given by the Wilson-Cowan equation (1.3). We conjecture that the appropriate
condition for the function $f^*$ in the present setting, including time dependent
inputs, is



E 



(J^—n E0/ 



l(j,n) 



(3.3) 



such that the higher order terms are uniformly bounded and vanish in the limit
$n \to \infty$, and where the weights $W^n_{kj}$ and inputs $I_{k,n}(t)$ are defined as in (2.4) and
(2.6). Property (3.3) closely resembles condition (3.1) and trivially holds for linear $f$
with $f^* = f$.

3.2 Bounded State Space Master Equations 

We have already stated when introducing the microscopic model in Sect. 1.2 that the
interpretation of the parameter $l(k)$ as the number of neurons in the $k$th population
is not literally correct. The state space of the process is unbounded, hence arbitrarily
many neurons can be active, and thus each population contains arbitrarily many neurons.
In order to overcome this interpretation problem, it has been proposed to consider the
master equation only on a bounded state space. That is, the $k$th population consists of
$l(k)$ neurons, and $0 \le \Theta^k_t \le l(k)$ almost surely. Such master equations are simply
obtained by setting the transition rates for transitions of $\Theta^k$ from $l(k)$ to $l(k) + 1$ to zero.

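A first master equation of this form was considered in [22], which, in the present
notation, takes the form

\[ \frac{dP[\theta, t]}{dt} = \frac{1}{\tau} \sum_{k=1}^{P} \Big[ \big(l(k) - \theta^k + 1\big)\, \bar f_k(\theta - e_k)\, P[\theta - e_k, t] - \big(\theta^k + (l(k) - \theta^k)\, \bar f_k(\theta)\big) P[\theta, t] + (\theta^k + 1)\, P[\theta + e_k, t] \Big]. \tag{3.4} \]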
Versions of such a master equation for, e.g., one population only or coupled inhibitory
and excitatory populations were considered in [3, 22], and a van Kampen system
size expansion was carried out. Here, the bound on the state space provides a natural
parameter for the re-scaling, and thus a small parameter for the expansion. The setup of
this problem closely resembles the structure of excitable membranes, for which limits
have been obtained with the present technique by one of the present authors and
co-workers in [27]. Therefore, we conjecture that our limit theorems also apply to this
setting with minor adaptations, with essentially the same conditions and results as in
Sect. 2. However, the macroscopic limit that will be obtained does not conform
with the Wilson-Cowan equation but will be given by

\[ \tau \dot\nu(t,x) = -\nu(t,x) + \big(1 - \nu(t,x)\big)\, f\Big(\int_D w(x,y)\nu(t,y)\,dy + I(t,x)\Big). \tag{3.5} \]

Next, we return to the master equation (1.6) as discussed in this article in Sect. 1.2
and the comment we made regarding bounded state spaces in the footnote on page 7. In
our primary reference for this model [5], a bounded state space version of the
master equation was actually considered, where the activation rate for the event
$\Theta^k \to \Theta^k + 1$ is

\[ l(k)\, \bar f_k(\theta, t)\, \mathbb{I}_{[\theta^k < l(k)]} \tag{3.6} \]

replacing $l(k) \bar f_k(\theta, t)$ in (1.6). The van Kampen system size expansion was then
applied to this bounded state space master equation, tacitly neglecting possible
difficulties that might arise due to the discontinuity of (3.6) considered as a function
on $\mathbb{R}^P$. However, for the present, mathematically precise convergence results,
considering a bounded state space as originally suggested in [5] is problematic. The
discontinuous activation rate (3.6) causes the machinery developed in [27], which
depends on Lipschitz-type estimates, to break down. However, we strongly expect
that also in this case the law of large numbers with the deterministic limit given by 
the Wilson-Cowan equation (1.3) holds. Furthermore, also the Langevin approxima- 
tions should agree with the equations discussed in Sect. 2.3. However, we have not 
yet been able to prove such a theorem. We further conjecture that the results in this 
article can be used to prove the convergence for the bounded state space model by 
a domination argument. Heuristically, it seems clear that a bounded process should 
be dominated by a process that possesses the same dynamics inside the state space 
of the bounded process, but can stray out from that bounded domain. Hence, as the 
limit of the potentially larger process lies within the domain where the two processes 
agree also the dominated process should converge to the same limit. Mathematically, 
this line of argument relies on non-trivial estimates between occupation measures of 
high-dimensional Markov processes. This is work in progress. 

3.3 Activity Based Neural Field Model 

Finally, we return to a distinction in neural field theory mentioned in the beginning.
In contrast to rate-based neural field models of the Wilson-Cowan type (1.1),
there exists a second essential class of neural field models, so-called activity based
models, the prototype of which is the Amari equation

\[ \tau \dot\nu(t,x) = -\nu(t,x) + \int_D w(x,y)\, f\big(\nu(t,y)\big)\,dy + I(t,x). \tag{3.7} \]

We conjecture that also for this type of equations a phenomenological microscopic 
model can be constructed with a suitable adaptation of the activation rates and that 
limit theorems analogous to the results in Sect. 2.1 hold. Then also a Langevin equa- 
tion for this model can be obtained and used for further analysis. 



4 Proofs of the Main Results 

In this section, we present the proofs of the limit theorems. For the convenience of
the reader, as it is an important tool in the subsequent proofs, we first state the Poincaré
inequality. Let $D \subset \mathbb{R}^d$ be a convex domain; then it holds for any function $\phi \in H^1(D)$
that

\[ \|\phi - \phi_D\|_{L^2(D)} \le \frac{\operatorname{diam}(D)}{\pi}\, \|\nabla\phi\|_{L^2(D)}, \tag{4.1} \]

where $\phi_D$ is the mean value of the function $\phi$ on the domain $D$, i.e.,

\[ \phi_D = \frac{1}{|D|} \int_D \phi(x)\,dx. \tag{4.2} \]

Moreover, the constant in the right-hand side of (4.1) is the optimal constant depending
only on the diameter of the domain $D$, cf. [1, 23]. Whenever we omit the spatial domain
in the notation of norms or inner products in $L^2(D)$ or Sobolev spaces $H^\alpha(D)$, the norm
is to be interpreted as the norm over the whole domain $D$. If the norm
is taken only over a subset $D_{k,n}$, then this is always indicated explicitly.
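
As a quick numerical sanity check of (4.1), the following sketch evaluates both sides on a
one-dimensional interval $D = (0, d)$, where $\operatorname{diam}(D) = d$. The helper name
`poincare_sides`, the test functions, and the midpoint quadrature are illustrative assumptions
and not part of the original argument. The function $\cos(\pi x/d)$ is the first non-constant
Neumann eigenfunction on the interval and attains equality, which illustrates why the constant
cannot be improved.

```python
import numpy as np

def poincare_sides(phi, dphi, d, m=200_000):
    """Return (lhs, rhs) of (4.1) on D = (0, d), using midpoint quadrature."""
    h = d / m
    x = (np.arange(m) + 0.5) * h              # midpoints of m equal cells
    values = phi(x)
    mean = values.sum() * h / d               # phi_D as in (4.2)
    lhs = np.sqrt(((values - mean) ** 2).sum() * h)
    rhs = d / np.pi * np.sqrt((dphi(x) ** 2).sum() * h)
    return lhs, rhs

d = 2.0
# Extremal case: cos(pi x / d) gives equality in (4.1).
lhs_opt, rhs_opt = poincare_sides(lambda x: np.cos(np.pi * x / d),
                                  lambda x: -np.pi / d * np.sin(np.pi * x / d), d)
# Generic case: x^2 satisfies the inequality strictly.
lhs_gen, rhs_gen = poincare_sides(lambda x: x ** 2, lambda x: 2 * x, d)
print(lhs_opt, rhs_opt)
print(lhs_gen, rhs_gen)
```
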
For the benefit of the reader, we next repeat the limiting equation

\[ \tau \dot\nu(t,x) = -\nu(t,x) + f\Big(\int_D w(x,y)\nu(t,y)\,dy + I(t,x)\Big). \tag{4.3} \]

We denote by $F$ the Nemytzkii operator on $L^2(D)$ defined by

\[ F(g,t)(x) = f\Big(\int_D w(x,y)\, g(y)\,dy + I(t,x)\Big) \quad \forall\, g \in L^2(D), \tag{4.4} \]

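and for all $\theta \in \mathbb{N}_0^P$ we define a discrete version of the Nemytzkii operator via

\[ -\frac{1}{\tau}\nu^n(\theta) + \frac{1}{\tau} F^n(\nu^n(\theta), t) = \lambda^n(\theta, t) \int_{\mathbb{N}_0^P} \big(\nu^n(\xi) - \nu^n(\theta)\big)\, \mu^n\big((\theta, t), d\xi\big) = -\frac{1}{\tau}\nu^n(\theta) + \frac{1}{\tau} \sum_{k=1}^{P} \bar f_{k,n}(\theta, t)\, \mathbb{I}_{D_{k,n}}. \tag{4.5} \]

Note that $-\tau^{-1}(\phi, \nu^n(\theta))_{L^2} + \tau^{-1}(\phi, F^n(\nu^n(\theta), t))_{L^2}$ for $\phi \in L^2(D)$ corresponds to
the generator of $(\Theta^n_t, t)_{t \ge 0}$ applied to the function $(\theta, t) \mapsto (\phi, \nu^n(\theta))_{L^2}$.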

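Finally, another useful property is that the means of the process' components are
bounded. For each $k$, $n$ it holds that

\[ \mathbb{E}^n\Theta^{k,n}_t = \mathbb{E}^n\Theta^{k,n}_0 + \frac{1}{\tau} \int_0^t \Big( l(k,n)\, \mathbb{E}^n \bar f_{k,n}(Y^n_s) - \mathbb{E}^n\Theta^{k,n}_s \Big)\,ds \le \mathbb{E}^n\Theta^{k,n}_0 + \frac{1}{\tau} \int_0^t \Big( l(k,n)\, \|f\|_0 - \mathbb{E}^n\Theta^{k,n}_s \Big)\,ds, \]

see also (B.1). Therefore, it holds that $\mathbb{E}^n\Theta^{k,n}_t \le m^{k,n}_t$, where $m^{k,n}_t$ solves the
deterministic initial value problem

\[ \dot m^{k,n}_t = -\frac{1}{\tau} m^{k,n}_t + \frac{1}{\tau}\, l(k,n)\, \|f\|_0, \qquad m^{k,n}_0 = \mathbb{E}^n\Theta^{k,n}_0, \]

i.e.,

\[ m^{k,n}_t = e^{-t/\tau} \big( m^{k,n}_0 - l(k,n)\|f\|_0 \big) + l(k,n)\|f\|_0 \le l(k,n) \big( 1 + \|f\|_0 \big) \quad \forall\, t \ge 0. \tag{4.6} \]

Here, we also used the assumption $\mathbb{E}^n\Theta^{k,n}_0 \le l(k,n)$ on the initial condition.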
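
The bound (4.6) can be checked numerically from the closed-form solution of the initial value
problem. This is only a minimal sketch: the parameter values and the function name `mean_bound`
are hypothetical choices for illustration, with `f_sup` standing in for $\|f\|_0$.

```python
import math

def mean_bound(t, tau, l, f_sup, m0):
    """Closed-form solution of dm/dt = -(m - l * f_sup) / tau with m(0) = m0."""
    return math.exp(-t / tau) * (m0 - l * f_sup) + l * f_sup

tau, l, f_sup = 1.5, 50, 0.8          # hypothetical parameter values
m0 = l                                # largest admissible initial mean, E Theta_0 <= l
values = [mean_bound(0.1 * i, tau, l, f_sup, m0) for i in range(200)]
uniform_bound = l * (1.0 + f_sup)     # right-hand side of (4.6)
print(max(values), uniform_bound)
```
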

4.1 Proof of Theorem 2.1 (Law of Large Numbers)

In order to prove the law of large numbers, Theorem 2.1, we apply the law of large
numbers for Hilbert space valued PDMPs, see Theorem 4.1 in [27], to the sequence of
homogeneous PDMPs $(Y^n_t)_{t \ge 0} = (\Theta^n_t, t)_{t \ge 0}$. For the application of this theorem,
recall that the first, piecewise constant, vector-valued component of this process counts
the number of active neurons in each sub-population and the second, deterministic
component states time. The process $(Y^n_t)_{t \ge 0}$ is the usual 'space-time process', i.e.,
the homogeneous Markov process obtained via a state-space extension from the
inhomogeneous process $(\Theta^n_t)_{t \ge 0}$. The continuous component satisfies the simple ODE
$\dot t = 1$, $t(0) = 0$, and thus the full process is a PDMP. In the terminology of [27], the
sequence of coordinate functions on the different state spaces of the PDMPs $(Y^n_t)_{t \ge 0}$
into a common Hilbert space is given by the maps $\nu^n$ (2.7) with the common Hilbert
space $L^2(D)$. Thus, in order to infer convergence in probability (2.10) from Theorem 4.1
in [27], it is sufficient to validate the following conditions:

(LLN1) For fixed $T > 0$, it holds that

\[ \lim_{n \to \infty} \mathbb{E}^n \int_0^T \lambda^n(Y^n_t) \int_{\mathbb{N}_0^P} \|\nu^n(\xi) - \nu^n(\Theta^n_t)\|^2_{L^2}\, \mu^n(Y^n_t, d\xi)\, dt = 0. \tag{4.7} \]

(LLN2) The Nemytzkii operator $F$ satisfies a Lipschitz condition in $L^2(D)$ uniformly
with respect to $t \ge 0$, i.e., there exists a constant $L_0 > 0$ such that

\[ \|F(g_1, t) - F(g_2, t)\|_{L^2} \le L_0 \|g_1 - g_2\|_{L^2} \quad \forall\, t \ge 0,\ g_1, g_2 \in L^2(D). \tag{4.8} \]

(LLN3) For fixed $T > 0$, it holds that

\[ \lim_{n \to \infty} \mathbb{E}^n \int_0^T \|F^n(\nu^n_t, t) - F(\nu^n_t, t)\|_{L^2}\, dt = 0. \tag{4.9} \]

Note that the final condition of Theorem 4.1 in [27], i.e., the convergence of the initial
conditions, is satisfied by assumption. For a discussion of these conditions, we refer
to [27] and proceed to their derivation for the present model in the subsequent parts
(a) to (c).

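(a) In order to prove condition (4.7), we write the integral with respect to the discrete
probability measure $\mu^n$ as a sum. This yields

\[ \mathbb{E}^n \lambda^n(Y^n_t) \int_{\mathbb{N}_0^P} \|\nu^n(\xi) - \nu^n(\Theta^n_t)\|^2_{L^2}\, \mu^n(Y^n_t, d\xi) = \frac{1}{\tau}\, \mathbb{E}^n \sum_{k=1}^{P} \frac{1}{l(k,n)^2} \Big( \Theta^{k,n}_t + l(k,n)\, \bar f_{k,n}(Y^n_t) \Big) |D_{k,n}| \le \frac{1}{\tau}\, \frac{1 + 2\|f\|_0}{\ell_-(n)}\, |D|, \tag{4.10} \]

where we have used the upper bound (4.6) on the expectation $\mathbb{E}^n\Theta^{k,n}_t$ and the
assumption on the initial conditions. Next, integrating over $[0, T]$ and employing
the assumption $\lim_{n\to\infty} \ell_-(n) = \infty$ in (2.9) establishes condition (4.7).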
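(b) The Lipschitz condition (4.8) of the Nemytzkii operators is a straightforward
consequence of the Lipschitz continuity (1.4) of the gain function $f$, as

\[ \begin{aligned} \|F(g_1, t) - F(g_2, t)\|^2_{L^2} &= \int_D \Big| f\Big(\int_D w(x,y)\, g_1(y)\,dy + I(t,x)\Big) - f\Big(\int_D w(x,y)\, g_2(y)\,dy + I(t,x)\Big) \Big|^2 dx \\ &\le L^2 \int_D \Big| \int_D w(x,y) \big(g_1(y) - g_2(y)\big)\,dy \Big|^2 dx \\ &\le L^2 \int_D \|w(x,\cdot)\|^2_{L^2}\, \|g_1 - g_2\|^2_{L^2}\, dx \\ &= L^2\, \|w\|^2_{L^2 \times L^2}\, \|g_1 - g_2\|^2_{L^2}. \end{aligned} \]

Therefore, (4.8) holds with Lipschitz constant $L_0 := L \|w\|_{L^2 \times L^2}$.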
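
The Lipschitz estimate of part (b) carries over verbatim to a Riemann-sum discretisation, which
makes it easy to test. The sketch below is an illustration under stated assumptions: the domain
$D = (0,1)$, the exponential kernel, the sinusoidal input, and the logistic gain (whose Lipschitz
constant is $1/4$) are hypothetical choices that do not come from the article.

```python
import numpy as np

rng = np.random.default_rng(0)
m = 400                                              # grid cells on D = (0, 1)
h = 1.0 / m
x = (np.arange(m) + 0.5) * h
w = np.exp(-np.abs(x[:, None] - x[None, :]) / 0.2)   # hypothetical kernel w(x, y)
inp = 0.3 * np.sin(2 * np.pi * x)                    # hypothetical input I(t, x) at a fixed t
f = lambda z: 1.0 / (1.0 + np.exp(-z))               # logistic gain, Lipschitz constant 1/4
L = 0.25

def F(g):
    """Discretised operator (4.4): f applied to the Riemann sum of w * g plus the input."""
    return f(w @ g * h + inp)

def l2(u):
    """Discrete L^2(D) norm."""
    return np.sqrt((u ** 2).sum() * h)

w_norm = np.sqrt((w ** 2).sum() * h * h)             # discrete L^2 x L^2 norm of w
ratios = []
for _ in range(100):
    g1, g2 = rng.normal(size=m), rng.normal(size=m)
    ratios.append(l2(F(g1) - F(g2)) / l2(g1 - g2))
print(max(ratios), L * w_norm)
```

The observed ratios are typically far below $L\|w\|_{L^2 \times L^2}$, since the Cauchy-Schwarz
step is only sharp for functions aligned with the kernel.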
(c) Finally, we prove the convergence of the generators (4.9). To this end, we employ
the characterisation of the norm in $L^2(D)$ by $\|\eta\|_{L^2} = \sup_{\|\phi\|_{L^2} = 1} |(\phi, \eta)_{L^2}|$ for

all $\eta \in L^2(D)$, and thus consider first the scalar product of elements $\phi \in L^2(D)$
with $\|\phi\|_{L^2} = 1$ and the difference inside the norm in (4.9). On the one hand, we
obtain using definition (4.5) that



(4.11) 



Next, we apply the Nemytzkii operator $F$ defined in (4.4) to $\nu^n_t$ and take the
inner product of the result with $\phi$ to obtain on the other hand

\[ \big(\phi, F(\nu^n_t, t)\big)_{L^2} = \Big(\phi,\ f\Big(\int_D w(\cdot, y)\, \nu^n_t(y)\,dy + I(t, \cdot)\Big)\Big)_{L^2}. \tag{4.12} \]

Subtracting (4.12) from (4.11), we obtain the integrated difference
(0,F"K,r)),,-(0,FK,r)),, 



.k=l 



f' 



> — ^— / w{x,y)dy + I(t,x) 



dx 



^ f 

I ^ 

-f 



<Pix) 



^ or f , 



l(j,n) 
,y)dy + I(t,x) 



dx. 



We proceed to estimate the norm of the term in the right-hand side. We use the
Lipschitz condition (1.4) on $f$, the triangle inequality, and finally the Cauchy-Schwarz
inequality on the resulting second term to obtain the estimate

\{4>,r{v';,t))^,-{4>,F{v';,t))^,\ 

<lV/ \(I}(x)\ 



'»•](•)(-„ [ \ - 



dx 
P an



< L 



(*) 



(**) 

Here, the term in the right-hand side marked (**) is further estimated using the
Cauchy-Schwarz inequality and the Poincaré inequality (4.1), which yields

1/2 



V,/(r)||^,. (4.13) 



We now consider the term marked (*). Inserting the definition of $W^n_{kj}$ given in
(2.4), reordering the summations, and changing the order of integration
yields



^ @ ''" /" I / 1 f 



w(x, y)dy 



dx 



X 



p p 



w(z, y)dz] — w(x, y) 



dy dx 



=Ei: 

k=i j=i 



0{ 



l{k,n) 



f [f I* 



J—/" w{z,y)dz] - w{x,y) 

\L>k.n \ JDt,n 



dx 



dy. 



We next apply the Cauchy-Schwarz inequality to the integral inside the square 
brackets in the last term. Thus, we obtain the estimate 



p p 

; = K^J> k=\ 

^ \l ( T7S~1 / w(z,y)dz 



w(x, y) 



dx 



1/2 



dy. 
Now the Poincaré inequality (4.1) is applied to the innermost integral inside the
square brackets, which yields

j=l ^ ' ^ "^"J-" k=l 

Finally, using once more the Cauchy-Schwarz inequality on the innermost sum- 
mation we obtain 

W^^^EtTtH/ ^2dy. (4.14) 

Now, a combination of the estimates (4.13) and (4.14) on the terms (*) and 
(**) yields 

\{cP,r{v^,t))^,-i<P,F{v"(t),t))^,\ 
^^+(«)^(£:|^4j|v.-(.y)IL.d,+ ||v./(r)||,| 

Here, the right-hand side is independent of $\phi$, hence taking the supremum over
all $\phi$ with $\|\phi\|_{L^2} = 1$ yields

||F"K,r)-F(v"(r),OIL. 



p 



<hW-ii2-^H [ l|v.»'(-,y)|L2dy+||v,/(r)||^A 
Finally, integrating over $(0, T)$ and taking the expectation on both sides results in

E" r\\F"{v';,t)-F{v"it),t)\\^,dt 
Jo 

< S+(n)^{^\T{l + ||/||o)l|V,u;||i2^i2 + Wy^Ih^iOJiL^))- 

(4.15) 

Here, we have used (4.6) and a combination of the Cauchy-Schwarz and Poincaré
inequality (4.1) in order to estimate



^"EIS / II V.H^O. y)|L2 dy < S^in) ^^^' + 11^11°^ ||V...«;||,2.,2. 



The upper bound in (4.15) is of order $\mathcal{O}(\delta_+(n))$ and, therefore, converges to zero
for $n \to \infty$ due to assumption (2.9). Hence, condition (4.9) is satisfied. The proof
of the convergence in probability (2.10) is completed.
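
The convergence just established can be illustrated by a small Monte Carlo experiment. The
following sketch is not the construction used in the proof and rests on several assumptions made
only for illustration: the domain is $D = (0,1)$ split into $P$ equal cells, every population has
the same size $l$, the kernel, input, and logistic gain are hypothetical, and the weights are a
midpoint approximation of cell-averaged weights. The transition rates are taken as
$\tau^{-1} l \bar f_k$ for activation and $\tau^{-1}\theta^k$ for decay, with $\bar f_k$ assumed to be
the gain function applied to the weighted mean activities plus input. The function `sup_error`
returns the largest $L^2$ distance, over a time grid, between the simulated mean activity and a
Runge-Kutta solution of the spatially discretised Wilson-Cowan equation.

```python
import numpy as np

def sup_error(P, l, T, tau=1.0, seed=0):
    """Largest L^2(0,1) distance on a time grid between the simulated mean activity
    and the solution of the spatially discretised Wilson-Cowan equation."""
    rng = np.random.default_rng(seed)
    x = (np.arange(P) + 0.5) / P                       # midpoints of P equal cells D_k
    # Assumed weights: midpoint approximation w(x_k, x_j) |D_j| of the averaged kernel.
    W = 2.0 * np.exp(-np.abs(x[:, None] - x[None, :]) / 0.3) / P
    inp = -1.0 + 0.5 * x                               # assumed time-independent input
    f = lambda z: 1.0 / (1.0 + np.exp(-z))             # assumed logistic gain function
    drift = lambda v: (-v + f(W @ v + inp)) / tau

    # Deterministic reference: classical Runge-Kutta scheme on a fixed grid.
    grid = np.linspace(0.0, T, 101)
    nu = np.empty((grid.size, P))
    nu[0] = 0.5
    dt = grid[1] - grid[0]
    for i in range(grid.size - 1):
        k1 = drift(nu[i])
        k2 = drift(nu[i] + 0.5 * dt * k1)
        k3 = drift(nu[i] + 0.5 * dt * k2)
        k4 = drift(nu[i] + dt * k3)
        nu[i + 1] = nu[i] + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6

    # Stochastic model: Gillespie simulation of the jump process Theta.
    theta = np.full(P, l // 2)                         # mean activity 0.5 at time 0 (l even)
    t, idx, err = 0.0, 0, 0.0
    while idx < grid.size:
        up = l * f(W @ (theta / l) + inp) / tau        # rates of theta^k -> theta^k + 1
        down = theta / tau                             # rates of theta^k -> theta^k - 1
        cum = np.cumsum(np.concatenate([up, down]))
        t_next = t + rng.exponential(1.0 / cum[-1])
        while idx < grid.size and grid[idx] < t_next:  # state is constant on [t, t_next)
            err = max(err, float(np.sqrt(np.mean((theta / l - nu[idx]) ** 2))))
            idx += 1
        k = min(int(np.searchsorted(cum, rng.random() * cum[-1], side="right")), 2 * P - 1)
        theta[k % P] += 1 if k < P else -1
        t = t_next
    return err

for l in (20, 200, 2000):
    print(l, sup_error(P=5, l=l, T=3.0))
```

In this sketch only $l$ is increased while $P$ stays fixed, so the experiment illustrates the
averaging over neurons but not the spatial refinement $\delta_+(n) \to 0$ required in Theorem 2.1.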
It is now easy to extend this result to the convergence in the $r$th mean. First
of all, the convergence in probability (2.10) implies for all $r \ge 1$ the convergence
in probability of the random variables $\sup_{t\in[0,T]} \|\nu^n_t - \nu(t)\|^r_{L^2}$ to zero. As convergence
in the mean of real valued random variables is equivalent to convergence in
probability together with uniform integrability, it remains to prove the latter for the
families $\sup_{t\in[0,T]} \|\nu^n_t - \nu(t)\|^r_{L^2}$, $n \in \mathbb{N}$.

We first consider the case $r = 1$, and establish a uniform bound on the second moments
$\mathbb{E}^n \sup_{t\in[0,T]} \|\nu^n_t - \nu(t)\|^2_{L^2}$. Then the de la Vallée-Poussin theorem, cf. App.,
Proposition 2.2 in [15], implies that the random variables $\sup_{t\in[0,T]} \|\nu^n_t - \nu(t)\|_{L^2}$,
$n \in \mathbb{N}$, are uniformly integrable.

Without loss of generality, we can assume that there exist^ Poisson processes
$(N^{k,n}_t)_{t\ge0}$ with rates $\Lambda_{k,n} = l(k,n)(1 + \|f\|_0)/\tau$, which dominate $(\Theta^{k,n}_t - \Theta^{k,n}_0)_{t\ge0}$
pathwise. Then we obtain almost surely

iv, i,2<2|vo|,, + 2^^ |Z),,„|<2|vo|,,+2^^^^|D,,„|. 
Here, the right-hand side is independent of $t \le T$, and thus we obtain

E« sup < 22<2E" vg 2,+2V ; |gMl<2E" Vp" + 

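where we have used that $N^{k,n}_T$ is Poisson distributed with rate $T\Lambda_{k,n}$, and thus
$\mathbb{E}^n (N^{k,n}_T)^2 = T\Lambda_{k,n} + T^2\Lambda^2_{k,n}$. Here, $C_T$ is some finite constant which depends on $T$
and the overall parameters of the model, i.e., $\tau$, $f$, $D$, but is independent of $k$ and $n$.
Using this upper bound, the triangle inequality yields the estimate

\[ \mathbb{E}^n \sup_{t\in[0,T]} \|\nu^n_t - \nu(t)\|^2_{L^2} \le 2 C_T \Big( \mathbb{E}^n \|\nu^n_0\|^2_{L^2} + \|\nu\|^2_{C([0,T], L^2)} + 1 \Big). \]

Therefore, using the assumption $\sup_{n\in\mathbb{N}} \mathbb{E}^n \|\nu^n_0\|^2_{L^2} < \infty$, it holds that

\[ \sup_{n\in\mathbb{N}} \mathbb{E}^n \sup_{t\in[0,T]} \|\nu^n_t - \nu(t)\|^2_{L^2} < \infty. \]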
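
The second-moment identity for a Poisson random variable used in this estimate, i.e.,
$\mathbb{E}N^2 = \lambda + \lambda^2$ for $N$ with mean $\lambda = T\Lambda_{k,n}$, can be confirmed by summing the
probability mass function directly. The values of `T` and `Lam` below are hypothetical and serve
only as an illustration.

```python
import math

def poisson_second_moment(rate, terms=400):
    """Sum k^2 * P(N = k) for N ~ Poisson(rate), using p_k = p_{k-1} * rate / k."""
    p, total = math.exp(-rate), 0.0
    for k in range(1, terms):
        p *= rate / k
        total += k * k * p
    return total

T, Lam = 2.0, 7.5                       # hypothetical horizon T and rate Lambda_{k,n}
numeric = poisson_second_moment(T * Lam)
closed_form = T * Lam + (T * Lam) ** 2  # T * Lambda + T^2 * Lambda^2
print(numeric, closed_form)
```
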

The general case for r > 1 works analogously. Note that the rth moment of the 
Poisson distribution is proportional to the rth power of its rate. Hence, just as in the 
case of r = 1 , the term 



^The Poisson process jumps at a faster rate than the components of the Markov chain regardless of the 
time and the state these are in. Furthermore, all jumps are upward. Hence, using a coupling argument as 
discussed in the proof of Theorem 4.3.5 in [16], we find that there exists a probability space supporting 
two processes with distributions equivalent to the Poisson process and the Markov chain component such 
that the Poisson process dominates the second process for all paths. Clearly, all moments dominate and 
these inequalities are then valid for any probability spaces supporting these processes. 
can thus be bounded from above by some constant $C_T$ independent of $k$ and $n$. The
proof of Theorem 2.1 is completed.

4.2 Proof of Corollary 2.1 (Corollary to the Law of Large Numbers) 

For $\alpha = 0$, the statement of the corollary coincides with the statement of Theorem 2.1,
hence we consider $\alpha > 0$. As in the proof of Theorem 2.1, we apply Theorem 4.1 in
[27] to the PDMPs $(Y^n_t)_{t\ge0}$, however, this time for the functions $\nu^n$ understood as
taking values in the Hilbert space $H^{-\alpha}(D)$ instead of $L^2(D)$. Thus, we have to validate
again conditions (LLN1)-(LLN3), wherein the norm in $L^2(D)$ is always replaced by
the norm in $H^{-\alpha}(D)$. The essential argument is sharpening the estimates in part (a)
of the proof of Theorem 2.1 using optimal Sobolev embedding theorems such that the
conditions (2.13) imply (LLN1). This we present in part (a) of the proof below. The
Lipschitz condition (LLN2) of the Nemytzkii operator $F$ in the spaces $H^{-\alpha}(D)$ is
established in part (b). Finally, as the condition $\delta_+(n) \to 0$ remains as in Theorem 2.1,
the condition (LLN3) follows immediately from the proof of Theorem 2.1 due to the
continuous embedding of $L^2(D)$ into $H^{-\alpha}(D)$.

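(a) In the case $\alpha = 0$, i.e., $H^{-\alpha}(D) = L^2(D)$, we used in (4.10) that
$\|\mathbb{I}_{D_{k,n}}\|^2_{L^2} = |D_{k,n}|$. For general $\alpha > 0$, we use the representation

\[ \|\mathbb{I}_{D_{k,n}}\|_{H^{-\alpha}} = \sup_{\|\phi\|_{H^\alpha} = 1} \big| (\phi, \mathbb{I}_{D_{k,n}})_{L^2} \big|. \]

In order to estimate the terms inside the supremum in the right-hand side, we use
Hölder's inequality and the Sobolev embedding theorem, i.e., $H^\alpha(D) \hookrightarrow L^\infty(D)$
for $\alpha > d/2$ and $H^\alpha(D) \hookrightarrow L^r(D)$ with $r = d/(d/2 - \alpha)$ for $0 < \alpha < d/2$, see
Theorem 7.34 and Corollary 7.17 in [2]. Thus, we obtain

\[ \big| (\phi, \mathbb{I}_{D_{k,n}})_{L^2} \big| \le \begin{cases} K_{d/(d/2-\alpha)}\, \|\mathbb{I}_{D_{k,n}}\|_{L^{2d/(d+2\alpha)}} & \text{if } 0 < \alpha < d/2, \\ K_\infty\, \|\mathbb{I}_{D_{k,n}}\|_{L^1} & \text{if } d/2 < \alpha, \end{cases} \]

where the constants $K$ are the constants arising from the continuous embeddings
of the Sobolev spaces into the Lebesgue spaces. Evaluating the norms in the
right-hand side, and further estimating using the maximal Lebesgue measure $v_+(n)$ of
the elements of the partition yields

\[ \|\mathbb{I}_{D_{k,n}}\|^2_{H^{-\alpha}} \le \begin{cases} K^2_{d/(d/2-\alpha)}\, |D_{k,n}|\, v_+(n)^{2\alpha/d} & \text{if } 0 < \alpha < d/2, \\ K^2_\infty\, |D_{k,n}|\, v_+(n) & \text{if } d/2 < \alpha. \end{cases} \]

Note that the upper bounds are consistent with the condition in Theorem 2.1
for $\alpha = 0$. Finally, as $H^{d/2}(D) \hookrightarrow H^{d/2-\epsilon}(D)$ for all small $\epsilon > 0$, the result for
$\alpha = d/2$ follows from the result above as

\[ \|\mathbb{I}_{D_{k,n}}\|_{H^{-d/2}} \le \sup_{\|\phi\|_{H^{d/2}} = 1} \Big( \|\phi\|_{L^{d/\epsilon}}\, \|\mathbb{I}_{D_{k,n}}\|_{L^{d/(d-\epsilon)}} \Big) \le C\, \|\mathbb{I}_{D_{k,n}}\|_{L^{d/(d-\epsilon)}}, \]

where $C$ is the constant resulting from the continuous embedding of $H^{d/2}(D)$
into $H^{d/2-\epsilon}(D)$. Thus, we obtain for all $\epsilon > 0$ the estimate

\[ \|\mathbb{I}_{D_{k,n}}\|^2_{H^{-d/2}} \le C^2\, |D_{k,n}|\, v_+(n)^{1 - 2\epsilon/d}. \]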
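
For the case $0 < \alpha < d/2$, the exponent arithmetic behind these bounds can be spelled out as
follows. This is a short worked computation added for convenience, using only the embedding
exponent $r = d/(d/2 - \alpha)$ stated above and the fact that $v_+(n)$ is the maximal Lebesgue
measure of the partition elements.

```latex
% Hoelder conjugate r' of the embedding exponent r = d/(d/2 - \alpha) = 2d/(d - 2\alpha):
\[
  \frac{1}{r'} = 1 - \frac{1}{r} = 1 - \frac{d - 2\alpha}{2d} = \frac{d + 2\alpha}{2d},
  \qquad\text{i.e.,}\qquad r' = \frac{2d}{d + 2\alpha}.
\]
% Norm of the indicator function in L^{r'} and the resulting bound:
\[
  \|\mathbb{I}_{D_{k,n}}\|_{L^{r'}}^{2} = |D_{k,n}|^{2/r'} = |D_{k,n}|^{1 + 2\alpha/d}
  = |D_{k,n}|\, |D_{k,n}|^{2\alpha/d} \le |D_{k,n}|\, v_+(n)^{2\alpha/d}.
\]
```

The same computation with $r' = 1$ gives $|D_{k,n}|^2 \le |D_{k,n}|\, v_+(n)$ for $\alpha > d/2$.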

(b) Next, we have to establish that the Nemytzkii operator $F$ on $L^2(D)$ is also
Lipschitz continuous with respect to the norms $\|\cdot\|_{H^{-\alpha}}$, $\alpha > 0$, i.e., for all $\alpha > 0$
there exists a constant $L_{-\alpha}$ such that

\[ \|F(g_1, t) - F(g_2, t)\|_{H^{-\alpha}} \le L_{-\alpha} \|g_1 - g_2\|_{H^{-\alpha}} \quad \forall\, t \ge 0,\ g_1, g_2 \in L^2(D). \tag{4.16} \]

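We obtain due to the Lipschitz continuity of $f$, which implies absolute continuity
of $f$, that

\[ \Big| \int_D \phi(x) \big( F(g_1,t)(x) - F(g_2,t)(x) \big)\,dx \Big| = \Big| \int_D \phi(x) \int_{z_2(t,x)}^{z_1(t,x)} f'(z)\,dz\,dx \Big|, \]

where

\[ z_1(t,x) = \int_D w(x,y)\, g_1(y)\,dy + I(t,x), \qquad z_2(t,x) = \int_D w(x,y)\, g_2(y)\,dy + I(t,x). \]

Applying Hölder's inequality with conjugate exponents $p$ and $q$ and the essential
boundedness of the derivative $f'$, we obtain the estimate

\[ \begin{aligned} \Big| \int_D \phi(x) \big( F(g_1,t)(x) - F(g_2,t)(x) \big)\,dx \Big| &\le \|\phi\|_{L^p} \Big( \int_D \Big| \int_{z_2(t,x)}^{z_1(t,x)} f'(z)\,dz \Big|^q dx \Big)^{1/q} \\ &\le \|\phi\|_{L^p} \Big( \int_D \big| \|f'\|_{L^\infty} \big( z_1(t,x) - z_2(t,x) \big) \big|^q dx \Big)^{1/q} \\ &= \|\phi\|_{L^p}\, \|f'\|_{L^\infty} \Big( \int_D \Big| \int_D w(x,y) \big( g_1(y) - g_2(y) \big)\,dy \Big|^q dx \Big)^{1/q}. \end{aligned} \]

Next, as by assumption $w(x,\cdot) \in H^\alpha(D)$, we obtain

\[ \begin{aligned} \Big( \int_D \big| (w(x,\cdot), g_1 - g_2)_{L^2} \big|^q dx \Big)^{1/q} &= \Big( \int_D \|w(x,\cdot)\|^q_{H^\alpha} \Big| \Big( \frac{w(x,\cdot)}{\|w(x,\cdot)\|_{H^\alpha}},\ g_1 - g_2 \Big)_{L^2} \Big|^q dx \Big)^{1/q} \\ &\le \Big( \int_D \|w(x,\cdot)\|^q_{H^\alpha}\, \|g_1 - g_2\|^q_{H^{-\alpha}}\, dx \Big)^{1/q} \\ &= \|w\|_{L^q \times H^\alpha}\, \|g_1 - g_2\|_{H^{-\alpha}}. \end{aligned} \]

Overall, this yields the estimate

\[ \big| (\phi, F(g_1,t) - F(g_2,t))_{L^2} \big| \le \|\phi\|_{L^p}\, \|f'\|_{L^\infty}\, \|w\|_{L^q \times H^\alpha}\, \|g_1 - g_2\|_{H^{-\alpha}}. \]

Hence, taking the supremum on both sides of this inequality over all $\phi$ with $\|\phi\|_{H^\alpha} = 1$,
we obtain the Lipschitz condition (4.16) with $L_{-\alpha} := L K_\alpha \|w\|_{L^q \times H^\alpha}$, where $K_\alpha$
is the constant resulting from the continuous embedding of $H^\alpha(D)$ into $L^p(D)$
and the Lipschitz constant $L$ of $f$ satisfies $L \ge \|f'\|_{L^\infty}$.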

4.3 Proof of Theorem 2.2 (Infinite Time Convergence) 

(a) We first present an alternative representation for the jump processes $(\Theta^n_t)_{t\ge0}$ and
the solution $\nu$ of the Wilson-Cowan equation (1.3). Using the generator of the
PDMP $(\Theta^n_t, t)_{t\ge0}$, we obtain that the components $\Theta^{k,n}$ satisfy

\[ \begin{aligned} \Theta^{k,n}_t &= \Theta^{k,n}_0 + \int_0^t \lambda^n(\Theta^n_s, s) \int_{\mathbb{N}_0^P} \big( \xi^k - \Theta^{k,n}_s \big)\, \mu^n\big( (\Theta^n_s, s); d\xi \big)\, ds + M^{k,n}_t \\ &= \Theta^{k,n}_0 + \int_0^t \Big( -\frac{1}{\tau} \Theta^{k,n}_s + \frac{1}{\tau}\, l(k,n)\, \bar f_{k,n}(\Theta^n_s, s) \Big)\, ds + M^{k,n}_t, \end{aligned} \tag{4.17} \]

where $(M^{k,n}_t)_{t\ge0}$ is a square-integrable càdlàg martingale given by

\[ M^{k,n}_t := \Theta^{k,n}_t - \Theta^{k,n}_0 - \int_0^t \lambda^n(\Theta^n_s, s) \int_{\mathbb{N}_0^P} \big( \xi^k - \Theta^{k,n}_s \big)\, \mu^n\big( (\Theta^n_s, s); d\xi \big)\, ds. \tag{4.18} \]

As the jump process is regular, this martingale is almost surely of finite variation
and it could also be written in terms of a stochastic integral with respect to
the associated martingale measure of the PDMP [16]. Next, interpreting $\Theta^{k,n}$ as
the solution of the stochastic evolution equation (4.17) driven by the martingale
$M^{k,n}$, it follows from the variation of constants formula that it satisfies

\[ \Theta^{k,n}_t = e^{-t/\tau}\, \Theta^{k,n}_0 + \frac{l(k,n)}{\tau} \int_0^t e^{-(t-s)/\tau}\, \bar f_{k,n}(\Theta^n_s, s)\, ds + \int_0^t e^{-(t-s)/\tau}\, dM^{k,n}_s. \tag{4.19} \]

This formula can also be easily verified path-by-path by inserting (4.19) into
(4.17) and using integration by parts. Note that here the stochastic integral with
respect to the martingale is just a Riemann-Stieltjes integral as the martingale is
of finite variation. For the sake of completeness, we briefly sketch the arguments.
Thus, inserting (4.19) into (4.17) yields 

6)f = 6)^-'" - - [' e-'/'0^'"ds 
^ Jo 

(*) 

1 

- -l(k,n) 

X 
^ ^0 ^0 Jq y



(**) 



■- [' ['\-^'-'-^^' dM^-" ds + Mf'" . 
^ Jo Jq 



(***) 

Considering the three terms marked (*)-(***) separately, we show that this
right-hand side equals (4.19). For the first term (*), simply evaluating the integral
yields

t Jo' 

which gives the first term in the right-hand side of (4.19). Next, we simplify the 
term (**) employing integration by parts to the first term in (**), which yields 

1 r re-(-'-)/V,,„(0;,r)drd. 
^ Jo Jo 

f\-('-'-y^f,„{0^,r)drds 
t: Jo Jo 

JO Jo 

= _ /"e-('-)A7^ J0«,,)d.+ f'j,„{0^,s)ds. 
Jo Jo 



1' .re^'-s)/r 



r Jo 



Thus, we obtain subtracting from this right-hand side the second term in (**) 
that 



(**)=- f e-^'-''^'fw^">':,s)ds. 

Jo 



This term is just the second term in the right-hand side of (4.19). It remains to 
consider the term marked (***). We have already stated that the stochastic inte- 
gral with respect to the martingale (4.18) is defined path-by-path as a Riemann- 
Stieltjes integral, and thus satisfies 



-- /"'e-('-'-)/MM*'" 
r Jo 

= -i^e-(^-^"'/^((9f;"-(9^*„") 



r 

t'J<s 
[ (|*_0^^.«)/,«((6,;,,),dt)d5, (4.20)



X 



where $\tau^n_j$ denotes the $j$th jump time of the $n$th PDMP. Integrating the sum in this
right-hand side over $(0, t)$ yields

^ J° rj<s 

= J2 e-^'-^")/^(0^V - 0'/) - J2 (fi*'"" - 0'"") 

Tl<t ' ' T'l<t ' ' 

= ^ e-('-^">/^(0^V - 0,V) - (of-" - 0o''")- 

Next, we apply integration by parts to the integral over (0, t) of the second term 
above analogously to the application to term (**), and obtain 

- f (\-"^'-'-^X"{@"^,r) [ (§'^-0,^'«)/z"((0;,r),d^)drds 
^ Jo Jo Jn^ 

Jo Jn(I 

+ f'x"{0:,s) f {e-0^'")ti"{{0:,s),d^)ds. 

Jo Jn^I 
Hence, overall these considerations show that

\[ (***) = \int_0^t e^{-(t-s)/\tau}\, dM^{k,n}_s, \]

and we obtain the final, third term in the right-hand side of (4.19). This completes
the proof that (4.19) solves Eq. (4.17).
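
Because the verification above is pathwise, it can be replayed numerically on an arbitrary
piecewise constant path. The sketch below does this under explicit assumptions: the jump times,
the unit jump sizes, and the function `fbar` (a stand-in for $s \mapsto \bar f_{k,n}(\Theta^n_s, s)$)
are hypothetical and are not sampled from the model, so the path may even become negative. This
is harmless, since the identity between the path and the right-hand side of (4.19) holds for any
such path once $M^{k,n}$ is defined through (4.17).

```python
import numpy as np

rng = np.random.default_rng(1)
tau, l, t_end, theta0 = 0.7, 30, 5.0, 12           # hypothetical parameters
jump_times = np.sort(rng.uniform(0.0, t_end, size=40))
jump_sizes = rng.choice([-1, 1], size=40)          # arbitrary unit jumps of the path
fbar = lambda s: 0.5 + 0.4 * np.sin(s)             # stand-in for the rate along the path
kernel = lambda s: np.exp(-(t_end - s) / tau)      # e^{-(t-s)/tau} evaluated at t = t_end

# Piecewise constant path: value levels[i] on the interval [breaks[i], breaks[i+1]).
breaks = np.concatenate([[0.0], jump_times, [t_end]])
levels = theta0 + np.concatenate([[0], np.cumsum(jump_sizes)])
theta_end = levels[-1]

# (1/tau) * int_0^t kernel(s) * Theta_s ds, integrated exactly on each interval.
int_theta = np.sum(levels * (kernel(breaks[1:]) - kernel(breaks[:-1])))
# (l/tau) * int_0^t kernel(s) * fbar(s) ds by the trapezoidal rule.
s = np.linspace(0.0, t_end, 20001)
y = kernel(s) * fbar(s)
int_f = l / tau * np.sum((y[1:] + y[:-1]) * np.diff(s)) / 2

# Riemann-Stieltjes integral against M: jump part minus compensator part.
int_dM = np.sum(kernel(jump_times) * jump_sizes) - (int_f - int_theta)
rhs = kernel(0.0) * theta0 + int_f + int_dM        # right-hand side of (4.19)
print(theta_end, rhs)
```

The drift integral involving `fbar` enters twice with opposite signs, so its quadrature error
cancels and the two printed numbers agree up to rounding.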

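Further, we obtain from the variation of constants formula for $\Theta^{k,n}$ also a
representation for the stochastic mean activity $\nu^n$ by inserting (4.19) into its
definition (2.7). This gives

\[ \nu^n_t = e^{-t/\tau}\, \nu^n_0 + \frac{1}{\tau} \int_0^t e^{-(t-s)/\tau}\, F^n(\nu^n_s, s)\, ds + \sum_{k=1}^{P} \frac{1}{l(k,n)} \int_0^t e^{-(t-s)/\tau}\, dM^{k,n}_s\, \mathbb{I}_{D_{k,n}}. \tag{4.21} \]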



Finally, in order to compare stochastic and deterministic solutions, we use that
the solution of the Wilson-Cowan equation can also be given via the variation of
constants formula, i.e., it holds that for all $t \ge 0$

\[ \nu(t) = e^{-t/\tau}\, \nu(0) + \frac{1}{\tau} \int_0^t e^{-(t-s)/\tau}\, F(\nu(s), s)\, ds. \tag{4.22} \]

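Thus, subtracting (4.22) from (4.21), and taking the expectation of the norm in
$H^{-\alpha}(D)$ yields the estimate

\[ \begin{aligned} \mathbb{E}^n \|\nu(t) - \nu^n_t\|_{H^{-\alpha}} \le{}& e^{-t/\tau}\, \mathbb{E}^n \|\nu(0) - \nu^n_0\|_{H^{-\alpha}} + \frac{1}{\tau} \int_0^t e^{-(t-s)/\tau}\, \mathbb{E}^n \|F(\nu(s), s) - F^n(\nu^n_s, s)\|_{H^{-\alpha}}\, ds \\ &+ \mathbb{E}^n \Big\| \sum_{k=1}^{P} \frac{1}{l(k,n)} \int_0^t e^{-(t-s)/\tau}\, dM^{k,n}_s\, \mathbb{I}_{D_{k,n}} \Big\|_{H^{-\alpha}}. \end{aligned} \tag{4.23} \]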

We deal with the terms in the right-hand side of (4.23) separately in the following
such that we can apply Gronwall's inequality. Note that the term containing the
initial condition vanishes due to the assumptions of the theorem. We start with
the stochastic integrals in the subsequent part (b) of the proof.
(b) As $\mathbb{E}|Y| \le \sqrt{\mathbb{E}|Y|^2}$ due to Jensen's inequality, it makes sense to calculate the
second moment of the stochastic integral in the right-hand side. For the norm in
$H^{-\alpha}(D)$, we use $\|\phi\|^2_{H^{-\alpha}} = (\phi, \phi)_{H^{-\alpha}}$, and thus obtain using the linearity of the
inner product



^ 1 /"' 
^^l(k,n)Jo ' 



■E 

Mi 



l(k, n)l(j, n) 



(IlDt.„jDj.„)if- 



We next consider the individual expectations of the random terms $|\beta^{k,n}_t|^2$ and
$\beta^{k,n}_t \beta^{j,n}_t$ in the right-hand side. We have already stated that the stochastic
integral with respect to the martingale (4.18) is defined path-by-path as a
Riemann-Stieltjes integral (see (4.20)) and, moreover, (4.20) implies that the stochastic
convolution integral can be written as a stochastic integral with respect to the
fundamental martingale measure $M^n$ associated with the PDMP $(\Theta^n_t, t)_{t\ge0}$ (see
[16]), i.e.,

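\[ \int_0^t e^{-(t-s)/\tau}\, dM^{k,n}_s = \int_{[0,t] \times \mathbb{N}_0^P} e^{-(t-s)/\tau} \big( \xi^k - \Theta^{k,n}_{s-} \big)\, M^n(ds, d\xi) \]

with predictable integrand

\[ (\xi, s, \omega) \mapsto e^{-(t-s)/\tau} \big( \xi^k - \Theta^{k,n}_{s-}(\omega) \big). \]

Then we obtain due to the Itô isometry following from Proposition 4.6.2 in [16]
using (4.6) that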

\ ft |2 

\Jo I 

= E" [' X" (y;) [ e-^c-^y' i^^ - V" (y's, d?) 
Jo Jn^ 

<E«^'e-2('-)/^Q(9^" + ^Z(fc, «)/.,„ (yf))d^ 

< l(E"6)^'-"+2/(yt,«)||/||o) /"'e-2('-')/M^. 
T Jo 

It remains to consider the product $\beta^{k,n}_t \beta^{j,n}_t$ for which we obtain due to the
integration by parts formula

\[ \beta^{k,n}_t \beta^{j,n}_t = \int_0^t \beta^{k,n}_{s-}\, d\beta^{j,n}_s + \int_0^t \beta^{j,n}_{s-}\, d\beta^{k,n}_s + \big[ \beta^{k,n}, \beta^{j,n} \big]_t, \tag{4.24} \]

where the square brackets denote the quadratic variation process. The expectation
of each of the terms in the right-hand side vanishes: The first two are stochastic
integrals with respect to martingales, hence martingales themselves which are
identical to zero at the origin. Furthermore, as both martingales are càdlàg with
paths of finite variation on compacts, hence quadratic pure jump martingales, we
obtain for the quadratic variation process

\[ \big[ \beta^{k,n}, \beta^{j,n} \big]_t = \sum_{0 < s \le t} \big( \beta^{k,n}_s - \beta^{k,n}_{s-} \big) \big( \beta^{j,n}_s - \beta^{j,n}_{s-} \big). \]

However, as all jump times of the two martingales a.s. differ it follows that

\[ \big[ \beta^{k,n}, \beta^{j,n} \big]_t = 0. \]
Thus, overall we have established that
p 



E" 



f-;nt.n)Ja 



1^^ 1 + 211/110 ,,2 



2^ l(k,n) 



(4.25) 



where $\tau/2$ is an upper bound for $\int_0^t e^{-2(t-s)/\tau}\, ds$ independent of $t$. Estimating
the norm $\|\mathbb{I}_{D_{k,n}}\|^2_{H^{-\alpha}}$ as in the proof of Corollary 2.1, we finally obtain



that 



E" 



P 1 f' 
^/MJo 



Dk,„ 



H-" 



<^((l+2||/||o)|/)| 



v+jn)' 
£_(«) 



r\ 1/2 



with $r = 2\alpha/d$ for $0 < \alpha < d/2$, $r = 1 - \epsilon$ for $\alpha = d/2$ and $r = 1$ for
$\alpha > d/2$.

(c) We next estimate the term

\[ \frac{1}{\tau} \int_0^t e^{-(t-s)/\tau}\, \mathbb{E}^n \|F(\nu(s), s) - F^n(\nu^n_s, s)\|_{H^{-\alpha}}\, ds \]

in (4.23). From part (b) of the proof of Theorem 2.1 in Sect. 4.1, it follows
that

E«||F«,f)-^"K-OII//- 

<S+(n) {^\{ 1 + II / II o) II V, «; II i2 i2 + II V, / (r) II ^2) , (4.26) 

where $F$ is the Nemytzkii operator defined in (4.4) and $K_{-\alpha}$ is a constant
resulting from the continuous embedding of $L^2(D)$ into $H^{-\alpha}(D)$. Here, the
right-hand side can be further estimated independently of $t \ge 0$ using the
assumption that $\|\nabla_x I(t)\|_{L^2}$ is uniformly bounded in $t \ge 0$. Furthermore, we
have shown in Sect. 4.2 in the proof of Corollary 2.1 that under the appropriate
assumptions the Nemytzkii operator $F$ is Lipschitz continuous on
$H^{-\alpha}(D)$, $\alpha > 0$, with Lipschitz constant $L_{-\alpha} > 0$ independent of $t \ge 0$,
i.e.,

\[ \|F(g_1, t) - F(g_2, t)\|_{H^{-\alpha}} \le L_{-\alpha} \|g_1 - g_2\|_{H^{-\alpha}} \quad \forall\, g_1, g_2 \in L^2(D). \tag{4.27} \]

A combination of the triangle inequality and the estimates (4.26) and (4.27) 
yields 

f\-^'-^y^E"\\F{v(s),s)-r{v':,s)\\^_^ 

Jo 

<L_„ f e-<'-^/^'E"||v(^)-<|| d^ + 0(3+(«)). 

^0 






Overall, it thus follows from (4.23) that 

E"||v(0-vr|lif-. <E"||v(0)-<||^_, 

+ — /"'e-('-^/^'E"||i;(i)-y;'|| „d5 
^ Jo 



+ OlS+(n)- 



lv+(ny 



Then an application of Gronwall's inequality yields 



\v+{ny 



xexp(^^£e-('-')/M^^ 

< (e" II v(0) - < II + O (s^(n) + yS^^y . 

By the assumptions of the theorem, the term in the right-hand side converges to zero
for $n \to \infty$. As this convergence is uniform in $t$, it holds that

\[ \lim_{n\to\infty} \sup_{t \ge 0} \mathbb{E}^n \|\nu(t) - \nu^n_t\|_{H^{-\alpha}} = 0. \tag{4.28} \]

4.4 Proof of Theorem 2.3 (Martingale Central Limit Theorem) 

In order to prove the martingale central limit theorem, we employ the general
martingale central limit theorem (Theorem 5.1 in [27]) for the Hilbert space $H^{-\alpha}(D)$, i.e.,
the dual of the Sobolev space $H^\alpha(D)$, for $\alpha > d$. To apply this theorem, it suffices to
prove the following conditions. Subsequently we use $\rho_n = \ell_-(n)/v_+(n)$ to denote
the re-scaling sequence and use the notation

\[ \langle G^n(Y^n_t)\phi, \phi \rangle_{H^\alpha} = \lambda^n(Y^n_t) \int_{\mathbb{N}_0^P} \langle \nu^n(\xi) - \nu^n(\Theta^n_t), \phi \rangle^2_{H^\alpha}\, \mu^n(Y^n_t, d\xi), \tag{4.29} \]

which corresponds to the quadratic variation of the martingales $(M^n_t)_{t\ge0}$, see [27] for
a discussion.

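(CLT1) For all $t > 0$, it holds that

\[ \sup_{n\in\mathbb{N}} \rho_n\, \mathbb{E}^n \int_0^t \Big[ \lambda^n(Y^n_s) \int_{\mathbb{N}_0^P} \|\nu^n(\xi) - \nu^n(\Theta^n_s)\|^2_{H^{-\alpha}}\, \mu^n(Y^n_s, d\xi) \Big]\, ds < \infty, \tag{4.30} \]

and there exists an orthonormal basis $(\phi_j)_{j\in\mathbb{N}}$ of $H^\alpha(D)$ such that for all
$j \in \mathbb{N}$ and $t > 0$

\[ \rho_n\, \mathbb{E}^n \langle G^n(Y^n_t)\phi_j, \phi_j \rangle_{H^\alpha} \le \gamma_j C, \tag{4.31} \]

where the constants $\gamma_j > 0$ are independent of $n$ and $t$, satisfy $\sum_{j\ge1} \gamma_j < \infty$,
and the constant $C > 0$ is independent of $n$ and $j$ but may depend on $t$.
(CLT2) The jump heights of the re-scaled martingales are almost surely uniformly
bounded, i.e., there exists a constant $\beta < \infty$ such that it holds almost surely
for all $n \in \mathbb{N}$ that

\[ \sup_{t \ge 0} \sqrt{\rho_n}\, \|\nu^n(\Theta^n_t) - \nu^n(\Theta^n_{t-})\|_{H^{-\alpha}} < \beta. \tag{4.32} \]

Further, for all $\phi \in H^\alpha(D)$ and all $t > 0$ it holds that

\[ \lim_{n\to\infty} \int_0^t \mathbb{E}^n \big| \langle G(\nu(s))\phi, \phi \rangle_{H^\alpha} - \rho_n \langle G^n(Y^n_s)\phi, \phi \rangle_{H^\alpha} \big|\, ds = 0. \tag{4.33} \]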

On a technical level, we note that the condition (CLT1) guarantees tightness of the
sequence of re-scaled martingales $(\sqrt{\rho_n}\,M^n_t)_{t\ge 0}$ in the Skorokhod space of càdlàg
functions in $H^{-\alpha}(D)$. This property is equivalent to relative compactness in the
topology of weak convergence of measures, and thus implies the existence of a
convergent subsequence. The conditions (CLT2) are then sufficient to establish that any
limit possesses the form of a diffusion process defined by the covariance operator $C$
given in (2.18). In particular, condition (4.33) precisely gives the convergence of the
quadratic variations and is thus the central condition. In the subsequent two parts of
the proof, we show that these conditions are satisfied: In part (a), we prove conditions (4.30) and
(4.31), and part (b) establishes (4.32) and (4.33).

(a) We first prove conditions (4.30) and (4.31). Here, we also observe the
significance of the choice of the norm in $H^{-\alpha}(D)$ with $\alpha > d$ for establishing the
convergence, which is essentially that it guarantees the existence of a Sobolev space
$H^{\alpha_1}(D)$ with continuous embeddings $H^{\alpha}(D) \hookrightarrow H^{\alpha_1}(D) \hookrightarrow C(D)$, where the
first is of Hilbert-Schmidt type. For subsequent use, we recall the estimates

with a suitable constant $K_\alpha > 0$, which we have already established in the proof
of Corollary 2.1 due to the Hölder inequality and the Sobolev embedding
theorem. Therefore, we obtain for the term inside the expectation in (4.30) the
estimate

$$\lambda^n(Y^n_s)\int_{\mathbb{N}^P}\big\|v^n(\xi)-v^n(\Theta^n_s)\big\|^2_{H^{-\alpha}}\,\mu^n(Y^n_s,d\xi)$$

Next, taking the expectation, using the bound (4.6) on $\mathbb{E}^n\Theta^{k,n}_t$ and integrating
over $[0, t]$, we obtain the estimate

f'E"\x"{Y':) f \\v"(^)-v"{0:)\\l^^ix"{Y';,d^)\ds 
Jo I Jn^ 



^0 

< 



-/^„^(l + 2||/||o)^. 



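Multiplying both sides with $\rho_n = \ell_-(n)/\nu_+(n)$, we find that condition (4.30) is
satisfied.

We proceed to condition (4.31) and first of all expand the integrand to obtain

$$\big(G^n(Y^n_t)\varphi_j,\varphi_j\big)_{H^\alpha} = \lambda^n(Y^n_t)\int_{\mathbb{N}^P}\big(v^n(\xi)-v^n(\Theta^n_t),\varphi_j\big)^2_{H^\alpha}\,\mu^n(Y^n_t,d\xi).$$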

We next estimate the term $(\mathbb{1}_{D_{k,n}},\varphi_j)_{H^\alpha}$. Here, we use the fact that for a function in
$L^2(D)$ its application as an element of the dual $H^{-\alpha}(D)$ as well as $H^{-\alpha_1}(D)$ for
any $\alpha_1$ with $0 \le \alpha_1 \le \alpha$ coincide. We choose $\alpha_1$ such that $d/2 < \alpha_1 < \alpha - d/2$
and obtain



where $K_{\alpha_1}$ is the constant resulting from the Sobolev embedding theorem. Next,
taking the expectation, estimating the expectation terms as before and multiplying
by $\rho_n$ yields

$$\rho_n\,\mathbb{E}^n\big(G^n(Y^n_t)\varphi_j,\varphi_j\big)_{H^\alpha} \le \frac{K^2_{\alpha_1}}{\tau}\,(1+\|f\|_0)\,\|\varphi_j\|^2_{H^{\alpha_1}}.$$

We choose the constants in (4.31) as $C := K^2_{\alpha_1}(1+\|f\|_0)/\tau$ and $\gamma_j := \|\varphi_j\|^2_{H^{\alpha_1}}$.
Finally, as due to Maurin's theorem the embedding of the space $H^{\alpha}(D)$ into
$H^{\alpha_1}(D)$ is of Hilbert-Schmidt type, cf. footnote 3 on page 16, it holds that
$\sum_{j\ge 1}\|\varphi_j\|^2_{H^{\alpha_1}} < \infty$. Hence, condition (4.31) is satisfied.
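The summability requirement on the constants $\gamma_j$ can be made concrete in a simplified setting. The following sketch is an illustration only and not part of the proof. It assumes the one-dimensional torus in place of the domain $D$, the Fourier basis $e_k(x) = e^{2\pi i k x}$, and the norm convention $\|e_k\|^2_{H^s} = (1+k^2)^s$. Under these assumptions $\varphi_k = (1+k^2)^{-\alpha/2} e_k$ is orthonormal in $H^\alpha$ and $\|\varphi_k\|^2_{H^{\alpha_1}} = (1+k^2)^{-(\alpha-\alpha_1)}$, which is summable exactly when $\alpha - \alpha_1 > 1/2 = d/2$. The function name `hs_partial_sum` is hypothetical.

```python
import numpy as np

def hs_partial_sum(gap, K):
    """Partial sum over |k| <= K of (1 + k^2)^(-gap), with gap = alpha - alpha_1.

    Under the assumed torus/Fourier convention (an illustration, d = 1) this is
    the truncated squared Hilbert-Schmidt norm of the embedding of H^alpha
    into H^alpha_1.
    """
    k = np.arange(-K, K + 1, dtype=float)
    return float(np.sum((1.0 + k**2) ** (-gap)))

if __name__ == "__main__":
    for K in (10**2, 10**3, 10**4, 10**5):
        # gap = 1 > d/2: partial sums settle; gap = 1/2 = d/2: they keep growing
        print(K, hs_partial_sum(1.0, K), hs_partial_sum(0.5, K))
```

For `gap = 1` the partial sums approach $\pi\coth(\pi)$, whereas for the borderline value `gap = 0.5` they grow logarithmically in `K`, which mirrors why a strict gap of more than $d/2$ between the Sobolev indices is needed.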
(b) The estimates in part (a) further show that the jump sizes are almost surely uni- 
formly bounded as 



sup>;;|| y" (©,") - v" (6);_) II < Ka 
r>0 



Here, the upper bound in the right-hand side converges to zero for $n \to \infty$, and
thus the left-hand side is bounded over all $n \in \mathbb{N}$. Therefore, condition (4.32)
holds and we are left to prove the convergence of the quadratic variation (4.33).
For the jump process, the quadratic variation satisfies

1^1 

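The quadratic variation of the limiting diffusion is given by

$$(G(v(t),t)\phi,\phi)_{H^\alpha} = \int_D \phi(x)^2\Big(\frac{1}{\tau}\,v(t,x) + \frac{1}{\tau}\,f\Big(\int_D w(x,y)\,v(t,y)\,dy + I(t,x)\Big)\Big)\,dx.$$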

Here, the necessary estimates are split into several parts which are separately 
considered in the following. Afterwards, the estimates are combined to infer the 
convergence (4.33). In the following, we use again F as the Nemytzkii operator 
defined in (4.4). Hence, for the difference of the quadratic variations, we obtain 
the estimate 

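$$
\begin{aligned}
&\mathbb{E}^n\big|(G(v(t),t)\phi,\phi)_{H^\alpha} - \rho_n\,(G^n(Y^n_t)\phi,\phi)_{H^\alpha}\big| \\
&\quad= \frac{1}{\tau}\,\mathbb{E}^n\bigg|\int_D \phi(x)^2 v(t,x) + \phi(x)^2 F(v(t),t)(x)\,dx
 - \rho_n\sum_{k=1}^{P}\frac{1}{\ell(k,n)}\Big(\frac{\Theta^{k,n}_t}{\ell(k,n)} + \bar f_{k,n}(Y^n_t)\Big)\big(\mathbb{1}_{D_{k,n}},\phi\big)^2_{L^2}\bigg| \\
&\quad\le \frac{1}{\tau}\,\mathbb{E}^n\bigg|\int_D \underbrace{\phi(x)^2 v(t,x)}_{(i)} + \underbrace{\phi(x)^2 F(v(t),t)(x)}_{(ii)}\,dx
 - \int_D \underbrace{\phi(x)^2 v^n(\Theta^n_t)(x)}_{(i)} + \underbrace{\phi(x)^2 F(v^n(\Theta^n_t),t)(x)}_{(ii)}\,dx\bigg| \\
&\qquad+ \frac{1}{\tau}\,\mathbb{E}^n\bigg|\int_D \underbrace{\phi(x)^2 v^n(\Theta^n_t)(x)}_{(iii)} + \underbrace{\phi(x)^2 F(v^n(\Theta^n_t),t)(x)}_{(iv)}\,dx \\
&\qquad\qquad - \rho_n\sum_{k=1}^{P}\Big(\underbrace{\frac{\Theta^{k,n}_t}{\ell(k,n)^2}\big(\mathbb{1}_{D_{k,n}},\phi\big)^2_{L^2}}_{(iii)} + \underbrace{\frac{1}{\ell(k,n)}\,\bar f_{k,n}(Y^n_t)\big(\mathbb{1}_{D_{k,n}},\phi\big)^2_{L^2}}_{(iv)}\Big)\bigg|.
\end{aligned}
\tag{4.34}
$$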
Using the triangle inequality once again for each of the two differences and grouping
the terms marked (i)-(iv), we obtain four terms, which we subsequently estimate
separately. Finally, in part (v), we combine the four estimates.
(i) The first term is the simplest to estimate. Using the Cauchy-Schwarz
inequality, we obtain

$$\mathbb{E}^n\bigg|\int_D \phi(x)^2\big(v(t,x)-v^n_t(x)\big)\,dx\bigg| \le \|\phi\|^2_{L^4}\,\mathbb{E}^n\|v(t)-v^n_t\|_{L^2}. \tag{4.35}$$



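(ii) We next consider the difference arising from the terms marked (ii) and
obtain, using the Lipschitz condition (1.4) on $f$ and the Cauchy-Schwarz
inequality twice,

$$
\begin{aligned}
\mathbb{E}^n\bigg|\int_D \phi(x)^2\big(F(v(t),t)(x)-F(v^n_t,t)(x)\big)\,dx\bigg|
&\le L\,\mathbb{E}^n\int_D |\phi(x)|^2\,\bigg|\int_D w(x,y)\big(v(t,y)-v^n_t(y)\big)\,dy\bigg|\,dx \\
&\le L\,\mathbb{E}^n\int_D |\phi(x)|^2\,\|w(x,\cdot)\|_{L^2}\,\|v(t)-v^n_t\|_{L^2}\,dx \\
&\le L\,\|\phi\|^2_{L^4}\,\|w\|_{L^2\times L^2}\,\mathbb{E}^n\|v(t)-v^n_t\|_{L^2}.
\end{aligned}
\tag{4.36}
$$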



(iii) In order to estimate the next term, we use the bound (4.6) on $\mathbb{E}^n\Theta^{k,n}_t$, and
thus obtain

^ /(^, n) Uz),,„ l(k,n)\JD,„ J 

<(l+ll/llo) 



(i + II/IIo)^|Z)mI 



y2\Dk,n\TT^[ ct>(xfdx-(-^ f cP(x)dx) 

fr{ \Dk,n\JDk,„ \\Dk,n\JDk,n / 

Y— i (p(x)d^ 

\\Dk,n\JD,J ) 



Pn\Dk,n\ 



l(k,n)\Dk,n\ 



(l+ll/llo)^ /" (<I^M-T^f cp(y)dy) dx 

uJDt,„\ \lJk,n\JDu,n / 

^ \\L>k,n\JDk,n 
Then the estimate is completed by applying the Poincaré inequality (4.1) to the
first term, that is, estimating



ct>(x) 



If Y diam(D<:„) , 

h,n I Jd, ' 



\Dk,n\ JDk,n 



and the observation that the second term is proportional to $\|\bar\phi^n\|^2_{L^2}$, where $\bar\phi^n$
is the piecewise constant approximation to $\phi$ based on the partition of $D$ into the
subdomains $D_{k,n}$, see (4.2). Therefore, we overall obtain an upper bound for the difference
constituted by the terms (iii) in (4.34) by



E" 



k=l 



l^i + (l+ll./llo)^(«). (4.37) 



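In the last term,

$$R(n) := \bigg|\,1 - \frac{\nu_-(n)}{\nu_+(n)}\,\frac{\ell_-(n)}{\ell_+(n)}\bigg|\;\|\bar\phi^n\|^2_{L^2}$$

converges to zero for $n \to \infty$ by assumption (2.19) and as the sequence
$\|\bar\phi^n\|^2_{L^2}$ is bounded, since it converges to $\|\phi\|^2_{L^2}$ for $n \to \infty$.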
(iv) Finally, we consider the difference 



E" 



p 

f </)(x)2f«, t)(x)dx -p„J2 !iv^7k,n{Y")i^D,,„,<P)l 



<E«^ / 0(x)2f(v«,f)(x)dx--^/,,„(y,"){</,,Iz),,„>i 

We continue estimating the difference in each summand in the final right- 
hand side and obtain using the triangle inequality for the term inside the 
expectation 



E" J2 f <l>ixfF{vlt){x)dx - ^!L^/, „(F,"){I^^„,0>2^, 

<E«^ / </.(x)2(F(v«,f)(x)-/,,„(y;'))dx 



(*) 



k=i yJoii,, 



l(k,n) 



(=1=*) 
We start with the first term and observe that it possesses the same structure
as the term estimated in part (c) of the proof of Theorem 2.1, with the only
difference that here the function $\phi$ in the integrand is squared. Therefore,
we obtain the estimate, cf. (4.15),



(*) < 3+(„)^^(^(l + ||/||o)||V,»;||i2^i2 + II V,/(0||^2). 



Next, we estimate the second term. Note that $\bar f_{k,n}$ is bounded by $\|f\|_0$, and
thus the remaining term is just as in part (iii) of the proof. Hence, we obtain
the estimate, cf. (4.37),



(**)<5+(«)2M^||0||2^, + ||/||o 



1 



V-{n) l-(n) 



v+(n) l+(n) 



I T" II 2 

I'/' IIl2- 



Therefore, we overall obtain an upper bound for the difference generated by 
the terms (iv) by 



E" 



f ^ 1 - 



<&+{n) ^^^^ (^( 1 + II / II o) II V, «; II ^2 ^ i2 + II V, / (0 II 



+ 3+(«)2^||0||2 +||/||o/;(„), 



(4.38) 



where the term $R(n)$ is as in (4.37).
(v) To complete the proof, we combine the estimates (4.35)-(4.38) to obtain 

E" |(G(i;(f), t)cp, ct>)^, - p„{G'\t)ct>, </.)^„ | 

< (1 + L\\wh2^L^)ml4E"\\v(t) - <||^2 + {I + 2\\ f Wo) R(n) 



+ S+{n)- 



L4 



(^(l + ||/||o)||V,w;||i2>,i2 + ||V,/(0||^2) 



^„ , ^2 l + 2||/llo „^„2 
+ S+(n) 2 Mini- 



Integrating over (0, T), we obtain with a suitable constant > 0 indepen- 
dent of n and T the estimate 

/ E"\{G{v(t), t)ct>, 0)^, - Pn{G"(t)4>, (P)h^ I dt 
Jo 

< Q(E"||y(f) - <||^,((o_^)^^2) + TRin) 

+ T8(n){l + II V.,/(f)||^,(((,_^)_^2)) +^(«)^). 
The constant $C_\phi$ depends on the norms of $\phi$ in the spaces $H^1(D)$ and $L^4(D)$,
where the latter can be estimated in terms of the norm in the Sobolev space
$H^{\alpha}(D)$ due to the embedding $H^{\alpha}(D) \hookrightarrow L^4(D)$, i.e., $C_\phi$ is finite and
depends only on $\phi \in H^{\alpha}(D)$. Finally, each term in the right-hand side
converges to zero for $n \to \infty$, and hence condition (4.33) follows. The proof of
Theorem 2.3 is completed.
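The role of the piecewise constant approximation $\bar\phi^n$ in part (iii) can be illustrated numerically. The sketch below is an illustration under simplifying assumptions and not part of the proof: the domain is taken to be $D = [0,1]$ with $P$ equal cells, integrals are replaced by midpoint sums, and the comparison bound uses the Poincaré constant $\operatorname{diam}/\pi$ that is known to hold on convex domains (whether (4.1) uses exactly this constant is an assumption). The function name `cell_average_approximation` is hypothetical.

```python
import numpy as np

def cell_average_approximation(phi, P, m=200):
    """Approximate phi on [0, 1] by its averages over P equal cells.

    Returns (error, norm): the L2([0,1]) distance between phi and its
    piecewise constant approximation, and the L2 norm of the approximation.
    Integrals are replaced by midpoint sums with m points per cell.
    """
    x = (np.arange(P * m) + 0.5) / (P * m)   # midpoints of a fine grid
    vals = phi(x).reshape(P, m)              # row k holds the samples in cell k
    avg = vals.mean(axis=1, keepdims=True)   # cell averages of phi
    error = np.sqrt(np.mean((vals - avg) ** 2))
    norm = np.sqrt(np.mean(avg ** 2))
    return float(error), float(norm)

if __name__ == "__main__":
    phi = lambda x: np.sin(2 * np.pi * x)    # ||phi'||_{L2} = sqrt(2) * pi
    for P in (4, 16, 64):
        error, norm = cell_average_approximation(phi, P)
        # Poincare-type bound: (cell diameter / pi) * ||phi'||_{L2} = sqrt(2) / P
        print(P, error, np.sqrt(2) / P, norm)
```

In this example the approximation error decays like the cell diameter, and the norm of the approximation increases towards $\|\phi\|_{L^2}$, which is the behaviour used to show that $R(n)$ is controlled by a bounded sequence.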
Appendix A: Well-Posedness of the Wilson-Cowan Equation

This section provides a concise exposition, based on classical existence theory, of
the well-posedness of the Wilson-Cowan equation (1.3), and the boundedness and
regularity results for its solution as referred to in Sect. 1.1. We understand Eq. (1.3)
as an $L^2(D)$-valued integral equation, i.e.,

$$v(t) = v_0 + \frac{1}{\tau}\int_0^t\big(-v(s) + F(v(s),s)\big)\,ds, \qquad t \ge 0,\ v_0 \in L^2(D), \tag{A.1}$$

where the integral is a Bochner integral and $F$ is the Nemytzkii operator acting on
$L^2(D)$ defined by

$$F(g,t)(x) = f\Big(\int_D w(x,y)\,g(y)\,dy + I(t,x)\Big) \qquad \forall\, g \in L^2(D).$$

As in Sect. 1.1, we assume that $f : \mathbb{R} \to \mathbb{R}_+$ is Lipschitz continuous, $w \in L^2(D \times D)$
and $I \in C(\mathbb{R}_+, L^2(D))$, which implies that $F$ is continuous in $t$. Furthermore, it was
shown in Sect. 4.1 that under these assumptions $F(g,t)$ is Lipschitz continuous in the
argument $g$ with Lipschitz constant independent of $t \ge 0$. Thus, the integrand in (A.1)
is Lipschitz continuous with respect to the $L^2(D)$-valued argument for all $t \ge 0$ and,
moreover, uniformly continuous in $g$ with respect to $t$. It follows that the integrand in
(A.1), that is, the map $(g,t) \mapsto -g + F(g,t)$, is jointly continuous on $\mathbb{R}_+ \times L^2(D)$.
Then Theorem 5.1.1 in [10] implies that there exists a unique, strongly continuous,
global solution to (A.1) for every initial condition $v_0 \in L^2(D)$. By definition, this
solution is absolutely continuous, and as $F$ is jointly continuous, the derivative of the
solution is continuous and exists everywhere. Thus, we conclude that there exists a
unique continuously differentiable solution, i.e., $v \in C^1(\mathbb{R}_+, L^2(D))$.
Next, we recall that an 'explicit' representation of the solution is given by the variation of
constants formula (4.22), which we already stated in Sect. 4.3. We have that the solution
of the Wilson-Cowan equation satisfies the integral equation

$$v(t) = v_0 + \int_0^t A\,v(s) + \frac{1}{\tau}\,F(v(s),s)\,ds,$$

where $A$ is the linear operator in $L^2(D)$ mapping $g$ to $-g/\tau$. Thus, the solution $v$
satisfies

$$v(t) = e^{tA}v_0 + \frac{1}{\tau}\int_0^t e^{(t-s)A}\,F(v(s),s)\,ds.$$

In the present setting, the application of the linear operator $e^{tA}$ corresponds to the
scalar multiplication with $e^{-t/\tau}$ as $A = -\frac{1}{\tau}\,\mathrm{Id}_{L^2}$, and thus

$$v(t) = e^{-t/\tau}v_0 + \frac{1}{\tau}\int_0^t e^{-(t-s)/\tau}\,F(v(s),s)\,ds \qquad \forall\, t \ge 0.$$
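The variation of constants formula can be checked numerically in a scalar toy case. The following sketch is an illustration under simplifying assumptions: the nonlinear term $F(v(s),s)$ is replaced by a known forcing $g(s)$, so that the equation reads $\tau v' = -v + g(t)$, and the function names `solve_rk4` and `variation_of_constants` are hypothetical. It compares a direct Runge-Kutta integration of the differential equation with a trapezoidal evaluation of $e^{-t/\tau}v_0 + \frac{1}{\tau}\int_0^t e^{-(t-s)/\tau}g(s)\,ds$.

```python
import numpy as np

def solve_rk4(g, v0, tau, t_end, n):
    """Integrate tau * v' = -v + g(t) with the classical Runge-Kutta scheme."""
    h = t_end / n
    rhs = lambda t, v: (-v + g(t)) / tau
    t, v = 0.0, float(v0)
    for _ in range(n):
        k1 = rhs(t, v)
        k2 = rhs(t + h / 2, v + h / 2 * k1)
        k3 = rhs(t + h / 2, v + h / 2 * k2)
        k4 = rhs(t + h, v + h * k3)
        v += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return v

def variation_of_constants(g, v0, tau, t_end, n):
    """Evaluate exp(-t/tau) v0 + (1/tau) int_0^t exp(-(t-s)/tau) g(s) ds."""
    s = np.linspace(0.0, t_end, n + 1)
    integrand = np.exp(-(t_end - s) / tau) * g(s)
    h = t_end / n
    # composite trapezoidal rule, written out to avoid version-specific helpers
    integral = h * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))
    return float(np.exp(-t_end / tau) * v0 + integral / tau)

if __name__ == "__main__":
    print(solve_rk4(np.sin, 0.3, 1.0, 2.0, 2000))
    print(variation_of_constants(np.sin, 0.3, 1.0, 2.0, 2000))
```

For the Wilson-Cowan equation itself the forcing depends on the solution, so the formula is an implicit representation rather than an explicit solution; the toy case only illustrates the exponential weighting of the linear part.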
T Jo 

We next discuss the results stated in Sect. 1.1 on the higher spatial regularity of
solutions to (A.1). A pointwise bound on $v(t) \in L^2(D)$, i.e., a constant $C$ such
that $|v(t,x)| \le C$ for almost all $x \in D$ and all $t \ge 0$, is then easily obtained by an
approximation argument, that is, approximating the less regular solution by solutions
of higher regularity. It is possible to prove the pointwise bounds directly; see, e.g.,
[26] for such an argument in a similar setting. However, it is easier and more
illustrative to use available results for solutions of higher spatial regularity, which
usually arise as the deterministic solution of (A.1) one is interested in. For example,
the authors in [34] argue that from an application point of view it is reasonable to
consider at least continuous solutions. In particular, the authors in [34] present a detailed
existence and uniqueness result for the activity based Amari mean field equation and
state that an analogous result holds for the Wilson-Cowan equation (A.1) for spatial
dimensions $d \le 3$, which covers all physically relevant domains. Concerning the spatial
regularity, they consider the space $H^{\alpha}(D)$, where $\alpha$ is set to be the smallest integer
such that $\alpha > d/2$. The significance of the choice of $\alpha > d/2$ is, as so often in this
study, that this implies the embedding of the space $H^{\alpha}(D)$ into $C(D)$. Furthermore,
we then even obtain that $C([0,T], H^{\alpha}(D)) \subset C([0,T] \times D)$, i.e., the solution $v$ is
jointly continuous.

Therefore, we have the subsequent theorem, which is sufficient for the set-up in
this study. However, we note that existence and uniqueness of solutions of the Amari
equation were considered under less strict regularity assumptions on the coefficients
in [24], and we conjecture that these are also valid for the Wilson-Cowan equation.

Theorem A.1 (Sect. 2 in [34]) Let the domain $D$ be bounded and satisfy the strong
local Lipschitz property. We assume that $w \in H^{\alpha}(D \times D)$, that $f \in C^{\alpha}(\mathbb{R})$ with
all derivatives bounded, and that $I \in C(\mathbb{R}_+, H^{\alpha}(D))$. Then there exists a unique
global solution $v \in C([0,T], H^{\alpha}(D))$ to (A.1) for every $T > 0$ and every initial condition
$v_0 \in H^{\alpha}(D)$, which depends continuously on the initial condition and is
continuously differentiable. Moreover, the solution is globally bounded in $H^{\alpha}(D)$ if
the externally applied current $I$ is globally bounded.
Remark A.1 In the work [34], the authors assume for the domain only the cone
property, which is implied by the strong local Lipschitz property; see p. 84 in [2]. The
latter is the necessary boundary regularity for the present study, cf. footnote 3 on
page 16. Furthermore, in [34], it is also assumed that the gain function $f$ is infinitely
often differentiable with bounded derivatives, but it clearly suffices that $f$ is
$\alpha$-times continuously differentiable.

Finally, it remains to show the pointwise bound $v(t,x) \in (0, \|f\|_0)$ proposed in Sect. 1.1
if the initial condition satisfies $v_0(x) \in [0, \|f\|_0]$. Under Theorem A.1 the
solution $v(t,x)$ to (A.1) is jointly continuous and, therefore, the Wilson-Cowan
equation holds pointwise in $x$ everywhere and for all $t \ge 0$. Furthermore, $t \mapsto v(t,x)$ is
continuously differentiable for every fixed $x \in D$, and it is immediate that the bounds
are satisfied due to the fact that the derivative of the solution approaching $0$ or $\|f\|_0$
becomes positive or negative, respectively. Now, using an approximation result of
smooth solutions converging to the $L^2(D)$ solution, we obtain that even in this less
regular case the pointwise bounds hold almost everywhere.
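The pointwise bounds can be observed in a simple numerical experiment. The sketch below is an illustration under stated assumptions and not the construction used in the argument above: the domain is $D = [0,1]$ on a uniform grid, the gain function is the logistic function (so $\|f\|_0 = 1$), the input $I$ is time-independent, and the function name `simulate_wilson_cowan` is hypothetical. The explicit Euler step $v \leftarrow (1 - \Delta t/\tau)\,v + (\Delta t/\tau)\,f(\cdot)$ is a convex combination whenever $\Delta t \le \tau$, which is the discrete counterpart of the sign argument for the derivative at the boundary values.

```python
import numpy as np

def simulate_wilson_cowan(v0, w, current, tau=1.0, t_end=5.0, dt=0.01):
    """Explicit Euler scheme for tau * dv/dt = -v + f(int w v dy + I) on [0, 1].

    v0      : initial values on a uniform grid with N points
    w       : (N, N) array of kernel values w(x_i, y_j)
    current : (N,) array, a time-independent input I(x) (a simplification)
    The gain f is the logistic function, so f takes values in (0, 1).
    Returns the final state and the smallest and largest value ever attained.
    """
    f = lambda z: 1.0 / (1.0 + np.exp(-z))
    dx = 1.0 / len(v0)
    v = np.array(v0, dtype=float)
    lo, hi = v.min(), v.max()
    for _ in range(int(round(t_end / dt))):
        drive = f(w @ v * dx + current)      # quadrature for the integral term
        v = v + (dt / tau) * (-v + drive)    # convex combination if dt <= tau
        lo, hi = min(lo, v.min()), max(hi, v.max())
    return v, lo, hi

if __name__ == "__main__":
    N = 100
    x = (np.arange(N) + 0.5) / N
    w = 2.0 * np.exp(-((x[:, None] - x[None, :]) ** 2) / 0.02)  # assumed kernel
    v0 = (x > 0.5).astype(float)             # initial condition with values 0 and 1
    v, lo, hi = simulate_wilson_cowan(v0, w, np.zeros(N))
    print(lo, hi, v.min(), v.max())
```

Starting from an initial condition that touches both boundary values, the computed activity never leaves $[0,1]$ and lies strictly inside the interval after the first step.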



Appendix B: Comparisons of Moment Equations 

In this section, we discuss the moment equations for the $L^2(D)$-valued jump Markov
processes $v^n_t = v^n(\Theta^n_t)$. These can be derived from the corresponding moment
equations of the jump Markov process $(\Theta^n_t)_{t\ge 0}$ taking values in $\mathbb{N}^P$. This process is
analogous in structure to the usual model used in chemical reaction kinetics, cf., e.g.,
[21]. Thus, we can use the formulae derived in this reference to obtain, e.g., for the
mean the system of differential equations



Furthermore, it is straightforward to state a system for the second moments;
however, we are interested not so much in the moments of the Markov chain model
as in those of the $L^2(D)$-valued processes $(v^n_t)_{t\ge 0}$, which we can compare to the
Langevin approximation. As $v^n$ is a linear mapping from $\mathbb{R}^P$ into $L^2(D)$, it holds
that $v^n(\mathbb{E}^n\Theta^n_t) = \mathbb{E}^n v^n(\Theta^n_t)$ and $v^n\big(\tfrac{d}{dt}\mathbb{E}^n\Theta^n_t\big) = \tfrac{d}{dt}\mathbb{E}^n v^n(\Theta^n_t)$, and thus

$$\frac{d}{dt}\,\mathbb{E}^n v^n_t = -\frac{1}{\tau}\,\mathbb{E}^n v^n_t + \frac{1}{\tau}\,\mathbb{E}^n F^n(v^n_t,t). \tag{B.2}$$

For the second moments of the $L^2(D)$-valued process, we obtain for all $\phi \in L^2(D)$

$$\frac{d}{dt}\,\mathbb{E}^n(\phi,v^n_t)^2_{L^2} = \frac{2}{\tau}\,\mathbb{E}^n\big[(\phi,v^n_t)_{L^2}\,(\phi,-v^n_t+F^n(v^n_t,t))_{L^2}\big] + \mathbb{E}^n\big(G^n(\Theta^n_t,t)\phi,\phi\big)_{L^2}, \tag{B.3}$$

where the bilinear form $(G^n(\Theta^n_t,t)\phi,\phi)_{L^2}$ is as defined in (4.29).

Next, we state the moment equations for the stochastic partial differential
equations. We assume that the Langevin approximation (2.22) possesses a (strong)
solution in an appropriate Hilbert space $H$ and employ the Itô formula (Sect. 4.5 in [12]),
which yields for all $\phi \in H^*$



(0 
and 



Jo 

+ 2 [ (cl>,V,)HU,--V, + -FiVs,s)) ds 
Jo \ r r 

+ el f {ct>,G(V,,s)(P)f^ds. 
Jo 

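Next, we take the expectation on both sides of these identities and differentiate with
respect to $t$, resulting for the first moment in the differential equation

$$\frac{d}{dt}\,\mathbb{E}(\phi,V_t)_H = \Big(\phi,\ \mathbb{E}\Big[-\frac{1}{\tau}V_t + \frac{1}{\tau}F(V_t,t)\Big]\Big)_H,$$

which is equivalent to the abstract evolution equation in $H$ given by

$$\frac{d}{dt}\,\mathbb{E}V_t = -\frac{1}{\tau}\,\mathbb{E}V_t + \frac{1}{\tau}\,\mathbb{E}F(V_t,t). \tag{B.4}$$

And for the second moment, we obtain the differential equation

$$\frac{d}{dt}\,\mathbb{E}(\phi,V_t)^2_H = \frac{2}{\tau}\,\mathbb{E}\big[(\phi,V_t)_H\,(\phi,-V_t+F(V_t,t))_H\big] + \epsilon^2\,\mathbb{E}\big[(G(V_t,t)\phi,\phi)_H\big]. \tag{B.5}$$

Further, the linear noise approximation (2.21) satisfies the equations

$$\frac{d}{dt}\,\mathbb{E}U_t = -\frac{1}{\tau}\,\mathbb{E}U_t + \frac{1}{\tau}\,\mathbb{E}F(U_t,t) \tag{B.6}$$

and

$$\frac{d}{dt}\,\mathbb{E}(\phi,U_t)^2_H = \frac{2}{\tau}\,\mathbb{E}\big[(\phi,U_t)_H\,(\phi,-U_t+F(U_t,t))_H\big] + \epsilon^2\,(G(v(t),t)\phi,\phi)_H. \tag{B.7}$$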

Finally, we note that exactly the same moment equations hold for the variants of the
linear noise and Langevin approximation using a Q-Wiener process and an
appropriate diffusion coefficient, cf. Remark 2.4.

A comparison of the moment equations (B.1), (B.4), (B.6) for the mean and (B.3),
(B.5), (B.7) for the second moments shows that they are similar in structure, but do
not coincide. This is analogous to the properties of the moment equations in finite
dimensions, and as in finite dimensions there is one exception, which is the case of
first order transitions: If $F$ were affine in $v$, i.e., $F(v,t) = f_1(t)\cdot v + f_2(t)$, then
we obtain that the first moment equations (B.4) and (B.6) of the Langevin and linear
noise approximation, respectively, reduce to the Wilson-Cowan equation with $v(t) =
\mathbb{E}V_t = \mathbb{E}U_t$. Furthermore, if $F$ is affine, this implies that also $G$ is affine in $V_t$, and
thus

$$(\phi, G(V_t,t)\phi)_H = \frac{1}{\tau}\Big[(\phi, V_t\cdot\phi)_H + (\phi, f_1(t)\cdot V_t\cdot\phi)_H + (\phi, f_2(t)\cdot\phi)_H\Big]. \tag{B.8}$$

Taking the expectation on both sides and assuming interchangeability of the
expectation with the application of all the linear forms (think of the duality pairing as the
inner product in $L^2(D)$), we obtain

$$\mathbb{E}(G(V_t,t)\phi,\phi)_H = \frac{1}{\tau}\Big[(\phi, \mathbb{E}[V_t]\cdot\phi)_H + (\phi, f_1(t)\cdot\mathbb{E}[V_t]\cdot\phi)_H + (\phi, f_2(t)\cdot\phi)_H\Big] = (G(\mathbb{E}[V_t],t)\phi,\phi)_H. \tag{B.9}$$

As $\mathbb{E}V_t = \mathbb{E}U_t = v(t)$, we obtain that the second moment equations for the Langevin
approximation and the linear noise approximation coincide. Moreover, they are
closed (for each $\phi$), i.e., the system depends only on $\mathbb{E}V_t$ and $\mathbb{E}(\phi,V_t)^2_H$. Again,
this corresponds to the well-known case from finite-dimensional chemical reaction
kinetics.

Finally, if $F$ is affine, the connection between the moment equations for the resulting
Markov chain models is also interesting. On the one hand, the equation for the mean
coincides with the Wilson-Cowan equation where the gain function in its right-hand
side is given by $F^n$. As $F^n$ is essentially a piecewise constant approximation to $F$, the
resulting equations for the mean correspond to a spatial discretisation of the
Wilson-Cowan equation, cf. the continuum limit in the derivation of the mean field equation
in [5].
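The closure of the mean equation in the affine case can be illustrated with a small simulation. The sketch below is a minimal illustration under strong simplifying assumptions and not the model of this study in full: it uses a single population of $\ell$ neurons without spatial structure, an activation rate $(a\,\theta + \ell\,b)/\tau$ and a deactivation rate $\theta/\tau$, so that the activity $\theta/\ell$ sees the affine function $F(v) = a\,v + b$ with scalar coefficients. The names `simulate_affine_population` and `mean_activity` and all parameter values are hypothetical. Because the rates are affine in $\theta$, the mean activity satisfies $\tau m' = -m + a\,m + b$ exactly.

```python
import numpy as np

def simulate_affine_population(ell, a, b, tau, theta0, t_end, rng):
    """Gillespie simulation of one population with affine activation rate.

    Activation rate (a * theta + ell * b) / tau, deactivation rate theta / tau
    (assumed toy rates). Returns the number of active neurons at time t_end.
    """
    t, theta = 0.0, theta0
    while True:
        up = (a * theta + ell * b) / tau
        down = theta / tau
        total = up + down
        t += rng.exponential(1.0 / total)    # waiting time until the next jump
        if t > t_end:
            return theta
        theta += 1 if rng.random() < up / total else -1

def mean_activity(a, b, tau, m0, t):
    """Solution of tau * m' = -m + a * m + b with m(0) = m0 (requires a != 1)."""
    m_star = b / (1.0 - a)
    return m_star + (m0 - m_star) * np.exp(-(1.0 - a) * t / tau)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ell, a, b, tau, t_end = 50, 0.5, 0.25, 1.0, 2.0
    runs = [simulate_affine_population(ell, a, b, tau, 0, t_end, rng) / ell
            for _ in range(500)]
    print(np.mean(runs), mean_activity(a, b, tau, 0.0, t_end))
```

The empirical mean over independent runs agrees with the solution of the linear mean equation up to Monte Carlo error, whereas for a nonlinear gain function the mean equation would involve higher moments and would not be closed.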



In the case of an affine function $F(v,t) = f_1(t)\cdot v + f_2(t)$, the mapping $f_1$ is a linear form on $H$, which
is interchangeable with the expectation operator. For example, in the simplest case, the application of $f_1$
to $v \in H$ is just a multiplication by a scalar.